
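This article follows on from two earlier ones:

【The aarch64 exception model and Linux arm64 interrupt handling】

【Registering cascaded interrupt controllers in Linux and handling their interrupts】

Key interrupt concepts

Hardware interrupt number (hwirq)

hwirq: Hardware Interrupt Number

Definition: the raw number a hardware device has at its interrupt controller, fixed by the hardware design.

Every interrupt controller numbers the devices connected to it.

For example:

  1. In the GIC, INTIDs 0-15 are SGIs (software-generated interrupts), 16-31 are PPIs (private peripheral interrupts), and 32-1019 are SPIs (shared peripheral interrupts).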

	tlmm: pinctrl@400000 {
		compatible = "qcom,pitti-pinctrl";                  // matched against the driver
		reg = <0x400000 0x1000000>;
		interrupts = <GIC_SPI 227 IRQ_TYPE_LEVEL_HIGH>;     // wired to SPI 227 of the GIC (INTID 227 + 32 = 259)
		gpio-controller;                                    // marks this node as a GPIO controller
		#gpio-cells = <2>;
		interrupt-controller;                               // marks this node as an interrupt controller
		#interrupt-cells = <2>;
		wakeup-parent = <&mpm>;
		qcom,gpios-reserved = <36 37 38 39>;
	};

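The TLMM is cascaded under the GIC on SPI 227. The GIC devicetree binding counts SPIs from 0 while SPI INTIDs start at 32, so the hwirq in the GIC domain is 227 + 32 = 259 (0x103). From the GIC's point of view the TLMM is just another peripheral, and in this context the hardware number is usually called an INTID.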

(Figure: GICv3 TRM, section 2.2)
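To make the numbering concrete, here is a minimal userspace sketch (not kernel code) of how a devicetree specifier such as <GIC_SPI 227 ...> turns into a GIC INTID, and how an INTID falls into the SGI/PPI/SPI ranges listed above. All demo_* names are hypothetical; the offsets 32 and 16 follow the GIC devicetree binding.

```c
#include <assert.h>
#include <stdint.h>

/* First cell of a GIC interrupt specifier (values as in the DT binding). */
enum { DEMO_GIC_SPI = 0, DEMO_GIC_PPI = 1 };

enum demo_intid_class { DEMO_SGI, DEMO_PPI, DEMO_SPI, DEMO_OTHER };

/* Convert the <type number> cells of a specifier into a GIC INTID. */
static inline uint32_t demo_dt_to_intid(uint32_t type, uint32_t number)
{
	assert(type == DEMO_GIC_SPI || type == DEMO_GIC_PPI);
	/* SPIs start at INTID 32, PPIs at INTID 16. */
	return number + (type == DEMO_GIC_SPI ? 32u : 16u);
}

/* Classify an INTID using the ranges given in the text. */
static inline enum demo_intid_class demo_classify_intid(uint32_t intid)
{
	if (intid <= 15)
		return DEMO_SGI;
	if (intid <= 31)
		return DEMO_PPI;
	if (intid <= 1019)
		return DEMO_SPI;
	return DEMO_OTHER;	/* special/extended ranges, not covered here */
}
```

With these helpers, demo_dt_to_intid(DEMO_GIC_SPI, 227) yields 259, which is the 0x103 that shows up as a GICv3 hwirq in the listing later in the article.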

  2. The hwirq of a GPIO on a cascaded interrupt controller

// Suppose a button device uses GPIO 5 of the TLMM as its interrupt
button@0 {
    compatible = "gpio-keys";
    
    button0 {
        label = "Volume Up";
        gpios = <&tlmm 5 GPIO_ACTIVE_LOW>;
        // Interrupt spec: GPIO 5 on the tlmm controller, falling-edge triggered
        interrupt-parent = <&tlmm>;
        interrupts = <5 IRQ_TYPE_EDGE_FALLING>;
    };
};

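This button uses gpio5 of the TLMM. Its interrupt parent is the TLMM, and its hardware interrupt number inside the TLMM is 5.

An interesting aside: in the IRQ listing of a Qualcomm platform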

  23 0x30002      0x3        19         0          0          0          0          0          0          0           qcom,smp2p-adsp                ipcc            v.v (struct irq_desc *)0xffffff8002331a00    
  24 0x60002      0x3        4          0          0          0          0          0          0          0           qcom,smp2p-cdsp                ipcc            v.v (struct irq_desc *)0xffffff8002333000   

the hwirq values of these two interrupts are very large:

0x00030002
    ├── upper 16 bits (0x0003): possibly an interrupt controller ID or a bank number
    └── lower 16 bits (0x0002): the specific interrupt line on that controller
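A small sketch of that split, assuming (as the guess above does) that the value is simply two packed 16-bit fields. The helper names are hypothetical, and the real encoding is defined by the ipcc driver, which is not shown in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits of a packed hwirq (guessed to be a controller/bank ID). */
static inline uint16_t demo_hwirq_hi16(uint32_t hwirq)
{
	return (uint16_t)(hwirq >> 16);
}

/* Lower 16 bits of a packed hwirq (guessed to be the line on that bank). */
static inline uint16_t demo_hwirq_lo16(uint32_t hwirq)
{
	return (uint16_t)(hwirq & 0xffffu);
}

/* Self-check against the two rows quoted above. */
static inline void demo_hwirq_selfcheck(void)
{
	assert(demo_hwirq_hi16(0x30002) == 0x3 && demo_hwirq_lo16(0x30002) == 0x2);
	assert(demo_hwirq_hi16(0x60002) == 0x6 && demo_hwirq_lo16(0x60002) == 0x2);
}
```

Under this reading, the adsp and cdsp entries share the same low field (2) and differ only in the high field (3 versus 6).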

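Because a hardware interrupt number is assigned by the interrupt controller it belongs to, two interrupts on different controllers can end up with the same hwirq. To resolve this, the Linux kernel introduces the concept of a virtual interrupt number.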

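Virtual interrupt number (virq)

virq: Virtual Interrupt Number

Definition: a unified interrupt number abstracted by the Linux kernel, used to identify and manage interrupts inside the kernel.

Properties

  • Allocated and managed dynamically by the kernel

  • One global numbering space (0 to NR_IRQS-1)

  • Independent of the hardware implementation, giving drivers a consistent programming interface

A dump in the style of cat /proc/interrupts, extended with hwirq and irq_desc columns, looks like this: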

IRQ  HWIRQ        affinity   CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7        Name                           Chip            IRQ Structure       
   1 0x0          0x3        9868       10031      8613       9408       8928       9616       25598      26045       IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045756a00    
   2 0x1          0x3        5330       2620       2012       1851       1709       1703       17407      13294       IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045755800    
   3 0x2          0x3        1          1          1          1          1          1          1          0           IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045757c00    
   4 0x3          0x3        0          0          0          0          0          0          0          0           IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045754e00    
   5 0x4          0x3        0          0          0          0          0          0          0          1           IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045757800    
   6 0x5          0x3        8902       4995       4536       4901       4530       4886       11392      12431       IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045756e00    
   7 0x6          0x3        0          0          0          0          0          0          0          0           IPI                            GICv3           v.v (struct irq_desc *)0xffffff8045756600    
  11 0x13         0x3        3487       3122       3065       2972       2950       2987       11748      12069       arch_timer                     GICv3           v.v (struct irq_desc *)0xffffff8045757600    
  13 0x26         0x1        26         6          0          0          7          3          13         10          arch_mem_timer                 GICv3           v.v (struct irq_desc *)0xffffff8045754200    
  15 0x15         0x3        11         10         10         9          9          9          49         50          arm-pmu                        GICv3           v.v (struct irq_desc *)0xffffff800b2e1800    
  16 0x16e        0x3        12345      0          0          0          0          0          0          0           ipcc_0                         GICv3           v.v (struct irq_desc *)0xffffff80020f8c00    
  17 0xe5         0x3        0          0          0          0          0          0          0          0           mpm                            GICv3           v.v (struct irq_desc *)0xffffff800b2e4c00    
  18 0x56         0x3        0          0          0          0          0          0          0          0           None                           mpm-gic         v.v (struct irq_desc *)0xffffff8002468a00    
  19 0x0          0x3        0          0          0          0          0          0          0          0           volume_up                      spmi-gpio       v.v (struct irq_desc *)0xffffff8002468c00    
  20 0x103        0x3        0          0          0          0          0          0          0          0           None                           GICv3           v.v (struct irq_desc *)0xffffff80457c2600    
  21 0x61         0x3        0          0          0          0          0          0          0          0           wifi_ant_check                 msmgpio         v.v (struct irq_desc *)0xffffff80023c6400    
  22 0x20         0x3        0          0          0          0          0          0          0          0           apps_wdog_bark                 GICv3           v.v (struct irq_desc *)0xffffff8002235800    
  23 0x30002      0x3        19         0          0          0          0          0          0          0           qcom,smp2p-adsp                ipcc            v.v (struct irq_desc *)0xffffff8002331a00  

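The listing shows that:

  • The first column is the virq and the second column is the hwirq

  • A virq is unique, but a hwirq is not (0x0 appears under both GICv3 and spmi-gpio)

  • There must therefore be a mapping from (controller, hwirq) to virq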

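Interrupt lines

An "interrupt line" is the hardware path (signal route) that carries a device's "I need the CPU's attention" signal to the interrupt controller or CPU.

In Linux, though, "interrupt line" is used in two senses: physical and logical.

Physical interrupt lines

  • On the SoC or board, a peripheral has an IRQ pin (or an internal wire).

  • That wire goes to an interrupt controller (for example an SPI input of the GIC, or an interrupt input of a GPIO controller).

  • A level or pulse change on the wire (high/low, rising/falling edge) is the interrupt request.

Examples:

  • NIC IRQ pin → GIC SPI#74

  • A GPIO pin used as an interrupt input → GPIO controller, which then aggregates it onto one of the GIC's SPIs

That is the line in the literal sense.

Logical interrupt lines

Inside an interrupt controller, each input (each distinguishable interrupt source) usually has its own hardware interrupt number (hwirq/INTID).

People then also refer to "a given hwirq" as "an interrupt line".

  • GIC: one "line" usually corresponds to one INTID (for example SPI INTID=74).

  • GPIO controller: a GPIO pin used as an interrupt source can also be seen as one "line" (the GPIO controller has its own hwirq numbering).

A related point: shared interrupts

A shared interrupt means several devices share one physical line, that is, one hardware interrupt input. In Linux this shows up as:

  • Several irqactions hanging off the same irq_desc (several handlers registered on the same virq)

  • When the interrupt fires, each handler is asked in turn: "was it you?"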
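The "ask each one in turn" behaviour can be sketched in plain userspace C. This is only a toy model of the action chain, with hypothetical demo_* types standing in for irqaction, irqreturn_t and a device. Each handler checks its own device and reports whether the interrupt was its own.

```c
#include <assert.h>
#include <stddef.h>

enum demo_irqreturn { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 };

typedef enum demo_irqreturn (*demo_handler_t)(int irq, void *dev_id);

/* Toy counterpart of struct irqaction: one entry per registered handler. */
struct demo_irqaction {
	demo_handler_t handler;
	void *dev_id;			/* identifies the device on a shared line */
	struct demo_irqaction *next;
};

/* Toy device: 'pending' plays the role of its interrupt status register. */
struct demo_device {
	int pending;
	int serviced;
};

/* A well-behaved shared handler: claim the IRQ only if our device raised it. */
static inline enum demo_irqreturn demo_dev_handler(int irq, void *dev_id)
{
	struct demo_device *dev = dev_id;

	(void)irq;
	if (!dev->pending)
		return DEMO_IRQ_NONE;	/* not ours, let the next action try */
	dev->pending = 0;
	dev->serviced++;
	return DEMO_IRQ_HANDLED;
}

/* Walk the whole chain; return how many handlers claimed the interrupt. */
static inline int demo_handle_shared_irq(int irq, struct demo_irqaction *chain)
{
	int handled = 0;

	for (struct demo_irqaction *a = chain; a; a = a->next)
		if (a->handler(irq, a->dev_id) == DEMO_IRQ_HANDLED)
			handled++;
	return handled;
}
```

The dev_id pointer is what lets one handler function serve several devices on the same line, which is also why the kernel insists on a non-NULL dev_id for shared requests.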

Interrupt-related data structures

The interrupt descriptor: irq_desc

/**
 * struct irq_desc - interrupt descriptor
 * @irq_common_data:	per irq and chip data passed down to chip functions
 * @kstat_irqs:		irq stats per cpu
 * @handle_irq:		highlevel irq-events handler
 * @action:		the irq action chain
 * @status_use_accessors: status information
 * @core_internal_state__do_not_mess_with_it: core internal status information
 * @depth:		disable-depth, for nested irq_disable() calls
 * @wake_depth:		enable depth, for multiple irq_set_irq_wake() callers
 * @tot_count:		stats field for non-percpu irqs
 * @irq_count:		stats field to detect stalled irqs
 * @last_unhandled:	aging timer for unhandled count
 * @irqs_unhandled:	stats field for spurious unhandled interrupts
 * @threads_handled:	stats field for deferred spurious detection of threaded handlers
 * @threads_handled_last: comparator field for deferred spurious detection of threaded handlers
 * @lock:		locking for SMP
 * @affinity_hint:	hint to user space for preferred irq affinity
 * @affinity_notify:	context for notification of affinity changes
 * @pending_mask:	pending rebalanced interrupts
 * @threads_oneshot:	bitfield to handle shared oneshot threads
 * @threads_active:	number of irqaction threads currently running
 * @wait_for_threads:	wait queue for sync_irq to wait for threaded handlers
 * @nr_actions:		number of installed actions on this descriptor
 * @no_suspend_depth:	number of irqactions on a irq descriptor with
 *			IRQF_NO_SUSPEND set
 * @force_resume_depth:	number of irqactions on a irq descriptor with
 *			IRQF_FORCE_RESUME set
 * @rcu:		rcu head for delayed free
 * @kobj:		kobject used to represent this struct in sysfs
 * @request_mutex:	mutex to protect request/free before locking desc->lock
 * @dir:		/proc/irq/ procfs entry
 * @debugfs_file:	dentry for the debugfs file
 * @name:		flow handler name for /proc/interrupts output
 */
struct irq_desc {
	struct irq_common_data	irq_common_data;
	struct irq_data		irq_data;
	unsigned int __percpu	*kstat_irqs;        // per-CPU interrupt statistics
	irq_flow_handler_t	handle_irq;             // flow handler
	struct irqaction	*action;	/* IRQ action list */
	unsigned int		status_use_accessors;
	unsigned int		core_internal_state__do_not_mess_with_it;
	unsigned int		depth;		/* nested irq disables */
	unsigned int		wake_depth;	/* nested wake enables */
	unsigned int		tot_count;
	unsigned int		irq_count;	/* For detecting broken IRQs */
	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
	unsigned int		irqs_unhandled;
	atomic_t		threads_handled;
	int			threads_handled_last;
	raw_spinlock_t		lock;
	struct cpumask		*percpu_enabled;
	const struct cpumask	*percpu_affinity;
#ifdef CONFIG_SMP
	const struct cpumask	*affinity_hint;
	struct irq_affinity_notify *affinity_notify;
#ifdef CONFIG_GENERIC_PENDING_IRQ
	cpumask_var_t		pending_mask;
#endif
#endif
	unsigned long		threads_oneshot;
	atomic_t		threads_active;
	wait_queue_head_t       wait_for_threads;
#ifdef CONFIG_PM_SLEEP
	unsigned int		nr_actions;
	unsigned int		no_suspend_depth;
	unsigned int		cond_suspend_depth;
	unsigned int		force_resume_depth;
#endif
#ifdef CONFIG_PROC_FS
	struct proc_dir_entry	*dir;
#endif
#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
	struct dentry		*debugfs_file;
	const char		*dev_name;
#endif
#ifdef CONFIG_SPARSE_IRQ
	struct rcu_head		rcu;
	struct kobject		kobj;
#endif
	struct mutex		request_mutex;
	int			parent_irq;
	struct module		*owner;
	const char		*name;
} ____cacheline_internodealigned_in_smp;

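You can think of irq_desc as Linux's lifecycle-management object for a single virq: all the core information hangs off it.

Key members:

Which flow-control/dispatch logic this IRQ uses

handle_irq: the flow handler

  • It decides: level or edge? Is an EOI needed? Per-CPU? Oneshot?

  • Common functions: handle_level_irq / handle_edge_irq / handle_fasteoi_irq / handle_percpu_devid_irq ...

Note: handle_irq is not the device driver's handler. It is the dispatcher at the IRQ core level.

Which device handlers hang on this IRQ

action: head of the list of struct irqaction

  • Shared interrupt: one irq_desc carries several irqactions

  • Not shared: usually just one irqaction

Hardware-related information for this IRQ

  • irq_data (or the nested structures): holds irq, hwirq, domain, chip and chip_data

  • irq_chip: which irqchip driver controls this line (mask/unmask/ack/eoi/set_type...)

State and synchronization

  • A spinlock (desc->lock): protects the action list, status bits and so on

  • Depth/disable counters, pending state, statistics, affinity, wakeup state and similar

The interrupt domain: irq_domain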

struct irq_domain {
	struct list_head link;
	const char *name;
	const struct irq_domain_ops *ops;
	void *host_data;
	unsigned int flags;
	unsigned int mapcount;

	/* Optional data */
	struct fwnode_handle *fwnode;
	enum irq_domain_bus_token bus_token;
	struct irq_domain_chip_generic *gc;
	struct device *dev;
#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
	struct irq_domain *parent;
#endif

	ANDROID_KABI_RESERVE(1);
	ANDROID_KABI_RESERVE(2);
	ANDROID_KABI_RESERVE(3);
	ANDROID_KABI_RESERVE(4);

	/* reverse map data. The linear map gets appended to the irq_domain */
	irq_hw_number_t hwirq_max;
	unsigned int revmap_size;
	struct radix_tree_root revmap_tree;
	struct mutex revmap_mutex;
	struct irq_data __rcu *revmap[];
};

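irq_domain: the core object of the interrupt-number translation and allocation machinery

Important members:

  • ops: the irq_domain_ops (.translate / .alloc / .free / .select)

  • revmap[] / revmap_tree: the hwirq → irq_data tables. Whether a domain relies on the linear array or on the tree depends on how the irqchip driver creates the domain (linear or tree).

  • parent: points to the parent domain in a hierarchy

  • hwirq_max / revmap_size: the hwirq range and the size of the linear map

Key points:

Every irq_domain has its own revmap and revmap_tree

  • revmap is a linear array indexed by hwirq, typically used for small or dense hwirq ranges. It stores the hwirq → irq_data mapping.

  • revmap_tree is a radix tree, typically used for large or sparse hwirq ranges, where a linear array covering the whole range would waste memory.

hwirq numbering is maintained per irq_domain

  • Each irq_domain keeps its own reverse map, so a GIC domain and a GPIO domain each have their own hwirq range, and their hwirq numbers are independent of each other.

  • For example, the GIC domain has a set of INTIDs, while the GPIO domain uses GPIO pin numbers. The two are unrelated and managed separately.

How the virq for a hwirq is found

  • Given a domain (for example the GIC domain or a GPIO domain), __irq_resolve_mapping() looks up the irq_data for the hwirq passed in, and from it the irq_desc and the virq.

  • If hwirq is smaller than revmap_size it indexes revmap directly; otherwise (the hwirq is outside the linear range) it looks the hwirq up in revmap_tree.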
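The two-tier lookup can be modelled in a few lines of userspace C. This is a sketch under simplifying assumptions: the demo_* names and sizes are made up, the tables store a virq directly rather than an irq_data pointer, and a small unsorted table stands in for the radix tree.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_REVMAP_SIZE 8	/* hypothetical size of the linear part */
#define DEMO_SPARSE_MAX  4	/* hypothetical capacity of the "tree" */

struct demo_sparse_entry {
	unsigned long hwirq;
	unsigned int virq;
};

/* One instance per controller: each domain has its own hwirq namespace. */
struct demo_domain {
	unsigned int revmap[DEMO_REVMAP_SIZE];	/* 0 means "no mapping" */
	struct demo_sparse_entry sparse[DEMO_SPARSE_MAX];
	size_t nr_sparse;
};

/* Record hwirq -> virq in the linear array or, if out of range, in the table. */
static inline int demo_domain_map(struct demo_domain *d, unsigned long hwirq,
				  unsigned int virq)
{
	if (hwirq < DEMO_REVMAP_SIZE) {
		d->revmap[hwirq] = virq;
		return 0;
	}
	if (d->nr_sparse == DEMO_SPARSE_MAX)
		return -1;
	d->sparse[d->nr_sparse].hwirq = hwirq;
	d->sparse[d->nr_sparse].virq = virq;
	d->nr_sparse++;
	return 0;
}

/* Range check first, then either a direct index or a sparse lookup. */
static inline unsigned int demo_find_mapping(const struct demo_domain *d,
					     unsigned long hwirq)
{
	if (hwirq < DEMO_REVMAP_SIZE)
		return d->revmap[hwirq];
	for (size_t i = 0; i < d->nr_sparse; i++)
		if (d->sparse[i].hwirq == hwirq)
			return d->sparse[i].virq;
	return 0;
}
```

Because every struct demo_domain carries its own tables, hwirq 5 in one domain and hwirq 5 in another resolve to different virqs without any conflict, which is the point made above about per-domain numbering.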

struct irq_domain_ops {
	int (*match)(struct irq_domain *d, struct device_node *node,
		     enum irq_domain_bus_token bus_token);
	int (*select)(struct irq_domain *d, struct irq_fwspec *fwspec,
		      enum irq_domain_bus_token bus_token);
	int (*map)(struct irq_domain *d, unsigned int virq, irq_hw_number_t hw);
	void (*unmap)(struct irq_domain *d, unsigned int virq);
	int (*xlate)(struct irq_domain *d, struct device_node *node,
		     const u32 *intspec, unsigned int intsize,
		     unsigned long *out_hwirq, unsigned int *out_type);
#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
	/* extended V2 interfaces to support hierarchy irq_domains */
	int (*alloc)(struct irq_domain *d, unsigned int virq,
		     unsigned int nr_irqs, void *arg);
	void (*free)(struct irq_domain *d, unsigned int virq,
		     unsigned int nr_irqs);
	int (*activate)(struct irq_domain *d, struct irq_data *irqd, bool reserve);
	void (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
	int (*translate)(struct irq_domain *d, struct irq_fwspec *fwspec,
			 unsigned long *out_hwirq, unsigned int *out_type);
#endif
#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
	void (*debug_show)(struct seq_file *m, struct irq_domain *d,
			   struct irq_data *irqd, int ind);
#endif
};
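  • .translate: turns a DT/ACPI interrupt specifier into hwirq + type/flags

  • .select: picks the right domain when several domains could match

  • .alloc: called when a new mapping has to be created, as part of the path that allocates a batch of virqs and initializes them. That path:

    • allocates the virq numbers (__irq_alloc_descs() is the key step)

    • records the mapping in the revmap

    • sets the irq_chip, flow handler and related information

  • .free: releases a mapping (dynamic interrupt sources, driver unload)

struct irq_data

You will often find that a function takes a struct irq_data *d rather than an irq_desc *. irq_data is the layer closer to the hardware abstraction: it ties together desc, chip, domain, hwirq and virq.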

struct irq_data {
	u32			mask;
	unsigned int		irq;
	unsigned long		hwirq;
	struct irq_common_data	*common;
	struct irq_chip		*chip;
	struct irq_domain	*domain;
#ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
	struct irq_data		*parent_data;
#endif
	void			*chip_data;
};

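Typical fields:

  • irq: the virq

  • hwirq: the hardware number (the INTID for the GIC)

  • domain: the irq_domain this interrupt belongs to

  • chip: pointer to the irq_chip

  • chip_data: private data of the irqchip (for example a pointer to the GIC state, register base information and so on)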

struct irq_chip

struct irq_chip {
	const char	*name;
	unsigned int	(*irq_startup)(struct irq_data *data);
	void		(*irq_shutdown)(struct irq_data *data);
	void		(*irq_enable)(struct irq_data *data);
	void		(*irq_disable)(struct irq_data *data);

	void		(*irq_ack)(struct irq_data *data);
	void		(*irq_mask)(struct irq_data *data);
	void		(*irq_mask_ack)(struct irq_data *data);
	void		(*irq_unmask)(struct irq_data *data);
	void		(*irq_eoi)(struct irq_data *data);

	int		(*irq_set_affinity)(struct irq_data *data, const struct cpumask *dest, bool force);
	int		(*irq_retrigger)(struct irq_data *data);
	int		(*irq_set_type)(struct irq_data *data, unsigned int flow_type);
	int		(*irq_set_wake)(struct irq_data *data, unsigned int on);

	void		(*irq_bus_lock)(struct irq_data *data);
	void		(*irq_bus_sync_unlock)(struct irq_data *data);

#ifdef CONFIG_DEPRECATED_IRQ_CPU_ONOFFLINE
	void		(*irq_cpu_online)(struct irq_data *data);
	void		(*irq_cpu_offline)(struct irq_data *data);
#endif
	void		(*irq_suspend)(struct irq_data *data);
	void		(*irq_resume)(struct irq_data *data);
	void		(*irq_pm_shutdown)(struct irq_data *data);

	void		(*irq_calc_mask)(struct irq_data *data);

	void		(*irq_print_chip)(struct irq_data *data, struct seq_file *p);
	int		(*irq_request_resources)(struct irq_data *data);
	void		(*irq_release_resources)(struct irq_data *data);

	void		(*irq_compose_msi_msg)(struct irq_data *data, struct msi_msg *msg);
	void		(*irq_write_msi_msg)(struct irq_data *data, struct msi_msg *msg);

	int		(*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
	int		(*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);

	int		(*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);

	void		(*ipi_send_single)(struct irq_data *data, unsigned int cpu);
	void		(*ipi_send_mask)(struct irq_data *data, const struct cpumask *dest);

	int		(*irq_nmi_setup)(struct irq_data *data);
	void		(*irq_nmi_teardown)(struct irq_data *data);

	unsigned long	flags;
};

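irq_chip is the set of methods an irqchip driver provides so that the generic IRQ core can control the actual hardware.

Common callbacks:

  • .irq_mask / .irq_unmask

  • .irq_ack (edge interrupts and some other cases)

  • .irq_eoi (EOI mode, commonly used by the GIC)

  • .irq_set_type (edge/level)

  • .irq_set_affinity (which CPU the interrupt is routed to)

For the GIC: once an interrupt arrives, the flow handler calls the chip's ack/eoi/mask/unmask at the appropriate points.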


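With these basic data structures in mind, interrupt handling becomes fairly intuitive.

Who does what on the call path when an interrupt fires

  • irq_domain: finds the virq/desc for an INTID (hwirq) (the mapping layer)

  • irq_desc->handle_irq: performs the correct ack/eoi/mask for the interrupt type and calls the actions (the flow-control layer)

  • irqaction->handler/thread_fn: the device driver's actual handling logic (the business layer)

Interrupt initialization

The flow chart lays out the overall process, and with it in hand the sequence is fairly clear. The two articles mentioned at the start also cover it in detail.

If it is still hard to follow, run the code under QEMU with gdb and set breakpoints: you can then inspect variables and the stack at each breakpoint.

The key questions are answered below in Q&A form.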

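When is the interrupt descriptor created?

The irq_desc for the IRQ a device driver uses is generally not created at request_irq() time. It is created one step earlier: when that interrupt is first given a virq and a mapping is established.

For example, when probe calls:

  • platform_get_irq(pdev, 0)

  • of_irq_get(np, 0) / irq_of_parse_and_map(np, 0)

If this DT interrupt (some GIC INTID/hwirq) has not been mapped to a Linux virq yet, the kernel will:

  1. Allocate a new virq (a global number)

  2. Create the irq_desc for that virq

  3. Establish the hwirq ↔ virq mapping (revmap)
    and then return the virq to the driver.

That is why many drivers call platform_get_irq() first and devm_request_irq() afterwards: the irq_desc usually already exists after the first call.

platform_get_irq()
→ of_irq_get() / irq_of_parse_and_map() (drivers/of/irq.c)
→ irq_create_fwspec_mapping() (kernel/irq/irqdomain.c)
→ irq_domain_alloc_irqs() / irq_domain_alloc_irqs_hierarchy()
→ __irq_alloc_descs() (kernel/irq/irqdesc.c)
→ alloc_descs() creates and initializes the irq_desc
→ domain->ops->alloc() binds chip/handler/hwirq; the mapping is recorded in the revmap

One thing to watch for is shared interrupts. If the device IRQ is a shared line (IRQF_SHARED), or the INTID has already been mapped by someone else:

  • The irq_desc for that virq already exists

  • request_irq() merely adds another irqaction to the desc->action list

Mapping between hwirq and virq

As mentioned in the first section, a hwirq belongs either to the GIC or to a cascaded interrupt controller.

Key function: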

unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
{
	struct irq_domain *domain;
	struct irq_data *irq_data;
	irq_hw_number_t hwirq;
	unsigned int type = IRQ_TYPE_NONE;
	int virq;
    
	// Goal: find the irq_domain that matches this fwspec
	if (fwspec->fwnode) {
		domain = irq_find_matching_fwspec(fwspec, DOMAIN_BUS_WIRED);
		if (!domain)
			domain = irq_find_matching_fwspec(fwspec, DOMAIN_BUS_ANY);
	} else {
		domain = irq_default_domain;
	}

	if (!domain) {
		pr_warn("no irq domain found for %s !\n",
			of_node_full_name(to_of_node(fwspec->fwnode)));
		return 0;
	}
	// Call the translate callback in irq_domain_ops to obtain the hwirq
	if (irq_domain_translate(domain, fwspec, &hwirq, &type))
		return 0;
	// Look the hwirq up in this domain's revmap / revmap_tree to see
	// whether it already has a virq
	virq = irq_find_mapping(domain, hwirq);
	// ... (elided here: an existing mapping is reused instead of allocating a new one)
	if (irq_domain_is_hierarchy(domain)) {
		virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, fwspec);
		if (virq <= 0)
			return 0;
	} else {
		/* Create mapping */
		virq = irq_create_mapping(domain, hwirq);
		if (!virq)
			return virq;
	}
}

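irq_create_mapping: creating the mapping (irq_create_mapping() is a thin wrapper around irq_create_mapping_affinity() shown below)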

unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
				       irq_hw_number_t hwirq,
				       const struct irq_affinity_desc *affinity)
{
	struct device_node *of_node;
	int virq;

	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);

	/* Look for default domain if necessary */
	if (domain == NULL)
		domain = irq_default_domain;
	if (domain == NULL) {
		WARN(1, "%s(, %lx) called with NULL domain\n", __func__, hwirq);
		return 0;
	}
	pr_debug("-> using domain @%p\n", domain);

	of_node = irq_domain_get_of_node(domain);

	/* Check if mapping already exists */
	virq = irq_find_mapping(domain, hwirq);
	if (virq) {
		pr_debug("-> existing mapping on virq %d\n", virq);
		return virq;
	}

	/* Allocate a virtual interrupt number */
	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node),
				      affinity);
	if (virq <= 0) {
		pr_debug("-> virq allocation failed\n");
		return 0;
	}

	if (irq_domain_associate(domain, virq, hwirq)) {
		irq_free_desc(virq);
		return 0;
	}

	pr_debug("irq %lu on domain %s mapped to virtual irq %u\n",
		hwirq, of_node_full_name(of_node), virq);

	return virq;
}

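irq_domain_alloc_descs: allocating the descriptor

	start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
					   from, cnt, 0);

This finds a virq (or a run of cnt consecutive virqs) that is not in use yet and returns it.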
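The idea behind that bitmap search can be shown with a small userspace sketch. It is a simplified stand-in, not the kernel's bitmap_find_next_zero_area(): the demo_* names and the 64-entry size are assumptions, and the scan is a plain linear search for cnt consecutive clear bits starting at from.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_NR_IRQS       64u
#define DEMO_BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_WORDS         ((DEMO_NR_IRQS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* One bit per virq: set means "already allocated". */
struct demo_irq_bitmap {
	unsigned long bits[DEMO_WORDS];
};

static inline int demo_test_bit(const struct demo_irq_bitmap *m, unsigned int n)
{
	return (int)((m->bits[n / DEMO_BITS_PER_WORD] >> (n % DEMO_BITS_PER_WORD)) & 1UL);
}

static inline void demo_set_bit(struct demo_irq_bitmap *m, unsigned int n)
{
	m->bits[n / DEMO_BITS_PER_WORD] |= 1UL << (n % DEMO_BITS_PER_WORD);
}

/*
 * Find the first run of 'cnt' free virqs at or after 'from', mark the run
 * as allocated and return its first number, or -1 if no such run exists.
 */
static inline int demo_alloc_virqs(struct demo_irq_bitmap *m, unsigned int from,
				   unsigned int cnt)
{
	assert(cnt > 0);
	for (unsigned int start = from; start + cnt <= DEMO_NR_IRQS; start++) {
		unsigned int i;

		for (i = 0; i < cnt && !demo_test_bit(m, start + i); i++)
			;
		if (i == cnt) {
			for (i = 0; i < cnt; i++)
				demo_set_bit(m, start + i);
			return (int)start;
		}
	}
	return -1;
}
```

Starting the search at 1 rather than 0 in such a scheme keeps virq 0 free to mean "no interrupt", which matches how the functions above return 0 on failure.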

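Allocating irq_data

irq_data allocation for the GIC

When the device sits directly on the GIC, allocation goes through the alloc callback of gic_irq_domain_ops, that is gic_irq_domain_alloc.

(1) Fill in the linking information in irq_data

This ties virq, hwirq and domain together.

(2) Bind the irqchip and the flow handler (which determines desc->handle_irq)

This is usually done through irq_domain_set_info(), irq_set_chip_and_handler*() or similar:

  • Bind the irq_chip (the GIC's chip)

  • Set the flow handler (for many SPIs/PPIs this is handle_fasteoi_irq)

(3) Most important: write the revmap (hwirq → irq_data)

In the end the mapping has to be placed in the domain's reverse-map table, otherwise __irq_resolve_mapping() on the interrupt path cannot find it.

irq_data allocation on a cascaded interrupt controller

The N interrupt sources inside a child controller go through the child controller's own irq_domain mapping.

A child controller normally creates its own irq_domain (for example a gpio domain) and uses the GPIO pin number or child interrupt number as the hwirq:

  • Device A uses GPIO#17 as its interrupt: interrupt-parent = <&gpio>; interrupts = <17 ...>;

  • When device A is probed, platform_get_irq() resolves to gpio_domain and triggers:

    • gpio_domain->ops->alloc() (or the older .map())

    • allocation of a new virq/desc, with hwirq=17 → virq written into gpio_domain's revmap

❌ This step does not go through gic_irq_domain_alloc(), because the interrupt does not belong to the GIC domain at all.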

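Case study: the TLMM interrupt controller on Qualcomm platforms

A concrete example is the TLMM interrupt controller on Qualcomm platforms. The code lives in drivers/pinctrl/qcom/pinctrl-msm.c.

In the device tree it looks like this:

	tlmm: pinctrl@400000 {
		compatible = "qcom,pitti-pinctrl";                  // matched against the driver
		reg = <0x400000 0x1000000>;
		interrupts = <GIC_SPI 227 IRQ_TYPE_LEVEL_HIGH>;     // wired to SPI 227 of the GIC (INTID 227 + 32 = 259)
		gpio-controller;                                    // marks this node as a GPIO controller
		#gpio-cells = <2>;
		interrupt-controller;                               // marks this node as an interrupt controller
		#interrupt-cells = <2>;
		wakeup-parent = <&mpm>;
		qcom,gpios-reserved = <36 37 38 39>;
	};

Clearly the TLMM is an interrupt controller and a GPIO controller at the same time.

According to the analysis so far, the virq/hwirq mapping of an irq_domain is set up through irq_domain_ops.alloc. Yet nowhere in pinctrl-msm.c is an irq_domain_ops populated. Does that mean the earlier analysis was wrong?

It does not. Because the TLMM is also a GPIO interrupt controller, its driver leaves the irq_domain_ops setup to gpiolib.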

static const struct irq_chip msm_gpio_irq_chip = {
	.name			= "msmgpio",
	.irq_enable		= msm_gpio_irq_enable,
	.irq_disable		= msm_gpio_irq_disable,
	.irq_mask		= msm_gpio_irq_mask,
	.irq_unmask		= msm_gpio_irq_unmask,
	.irq_ack		= msm_gpio_irq_ack,
	.irq_eoi		= msm_gpio_irq_eoi,
	.irq_set_type		= msm_gpio_irq_set_type,
	.irq_set_wake		= msm_gpio_irq_set_wake,
	.irq_request_resources	= msm_gpio_irq_reqres,
	.irq_release_resources	= msm_gpio_irq_relres,
	.irq_set_affinity	= msm_gpio_irq_set_affinity,
	.irq_set_irqchip_state = msm_gpio_irq_set_irqchip_state,
	.irq_get_irqchip_state = msm_gpio_irq_get_irqchip_state,
	.irq_set_vcpu_affinity	= msm_gpio_irq_set_vcpu_affinity,
	.flags			= (IRQCHIP_MASK_ON_SUSPEND |
				   IRQCHIP_SET_TYPE_MASKED |
				   IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND |
				   IRQCHIP_IMMUTABLE),
};

static int msm_gpio_init(struct msm_pinctrl *pctrl)
{
	struct gpio_chip *chip;
	struct gpio_irq_chip *girq;
//...
	chip = &pctrl->chip;      // the chip member of msm_pinctrl
	chip->base = -1;
	chip->ngpio = ngpio;
	chip->label = dev_name(pctrl->dev);
	chip->parent = pctrl->dev;
	chip->owner = THIS_MODULE;
    //...
	girq = &chip->irq;         // fill in the irq member of msm_pinctrl's chip
	gpio_irq_chip_set_chip(girq, &msm_gpio_irq_chip);
	girq->parent_handler = msm_gpio_irq_handler;
	girq->fwnode = pctrl->dev->fwnode;
	girq->num_parents = 1;
	girq->parents = devm_kcalloc(pctrl->dev, 1, sizeof(*girq->parents),
				     GFP_KERNEL);
	if (!girq->parents)
		return -ENOMEM;
	girq->default_type = IRQ_TYPE_NONE;
	girq->handler = handle_bad_irq;
	girq->parents[0] = pctrl->irq;

	ret = gpiochip_add_data(&pctrl->chip, pctrl);  // registers the irq_domain
    //...
}

Here girq is a struct gpio_irq_chip:

/**
 * struct gpio_irq_chip - GPIO interrupt controller
 */
struct gpio_irq_chip {
	/**
	 * @chip:
	 *
	 * GPIO IRQ chip implementation, provided by GPIO driver.
	 */
	struct irq_chip *chip;

	/**
	 * @domain:
	 *
	 * Interrupt translation domain; responsible for mapping between GPIO
	 * hwirq number and Linux IRQ number.
	 */
	struct irq_domain *domain;

	/**
	 * @domain_ops:
	 *
	 * Table of interrupt domain operations for this IRQ chip.
	 */
	const struct irq_domain_ops *domain_ops;

#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
	/**
	 * @fwnode:
	 *
	 * Firmware node corresponding to this gpiochip/irqchip, necessary
	 * for hierarchical irqdomain support.
	 */
	struct fwnode_handle *fwnode;

	/**
	 * @parent_domain:
	 *
	 * If non-NULL, will be set as the parent of this GPIO interrupt
	 * controller's IRQ domain to establish a hierarchical interrupt
	 * domain. The presence of this will activate the hierarchical
	 * interrupt support.
	 */
	struct irq_domain *parent_domain;

///......
}

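For its pinctrl system, Qualcomm clearly bundles the related structures together, and girq is where the TLMM's irq_chip is recorded. One nice thing about the Qualcomm code is that such complex structures are kept reachable through a static variable, which makes it easy to inspect their contents with trace32 or crash. In this driver, for example: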

struct msm_pinctrl {
	struct device *dev;
	struct pinctrl_dev *pctrl;
	struct gpio_chip chip;
	struct pinctrl_desc desc;
	struct notifier_block restart_nb;

	int irq;
	int n_dir_conns;

	bool intr_target_use_scm;

	raw_spinlock_t lock;

	DECLARE_BITMAP(dual_edge_irqs, MAX_NR_GPIO);
	DECLARE_BITMAP(enabled_irqs, MAX_NR_GPIO);
	DECLARE_BITMAP(skip_wake_irqs, MAX_NR_GPIO);
	DECLARE_BITMAP(disabled_for_mux, MAX_NR_GPIO);
	DECLARE_BITMAP(ever_gpio, MAX_NR_GPIO);

	const struct msm_pinctrl_soc_data *soc;
	void __iomem *regs[MAX_NR_TILES];
	u32 phys_base[MAX_NR_TILES];

	struct msm_gpio_regs *gpio_regs;
	struct msm_tile *msm_tile_regs;
	bool hibernation;
};

static struct msm_pinctrl *msm_pinctrl_data;

The gpiochip_add_data function

#define gpiochip_add_data(gc, data) ({		\
		static struct lock_class_key lock_key;	\
		static struct lock_class_key request_key;	  \
		gpiochip_add_data_with_key(gc, data, &lock_key, \
					   &request_key);	  \
	})

I will not trace every call from here; follow the code yourself and you eventually arrive at this function:

static void gpiochip_hierarchy_setup_domain_ops(struct irq_domain_ops *ops)
{
	ops->activate = gpiochip_irq_domain_activate;
	ops->deactivate = gpiochip_irq_domain_deactivate;
	ops->alloc = gpiochip_hierarchy_irq_domain_alloc;

	/*
	 * We only allow overriding the translate() and free() functions for
	 * hierarchical chips, and this should only be done if the user
	 * really need something other than 1:1 translation for translate()
	 * callback and free if user wants to free up any resources which
	 * were allocated during callbacks, for example populate_parent_alloc_arg.
	 */
	if (!ops->translate)
		ops->translate = gpiochip_hierarchy_irq_domain_translate;
	if (!ops->free)
		ops->free = irq_domain_free_irqs_common;
}
static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
					       unsigned int irq,
					       unsigned int nr_irqs,
					       void *data)
{
	struct gpio_chip *gc = d->host_data;
	irq_hw_number_t hwirq;
	unsigned int type = IRQ_TYPE_NONE;
	struct irq_fwspec *fwspec = data;
	union gpio_irq_fwspec gpio_parent_fwspec = {};
	unsigned int parent_hwirq;
	unsigned int parent_type;
	struct gpio_irq_chip *girq = &gc->irq;
	int ret;

	/*
	 * The nr_irqs parameter is always one except for PCI multi-MSI
	 * so this should not happen.
	 */
	WARN_ON(nr_irqs != 1);

	ret = gc->irq.child_irq_domain_ops.translate(d, fwspec, &hwirq, &type);   // trace32 shows this is gpiochip_hierarchy_irq_domain_translate
	if (ret)
		return ret;

	chip_dbg(gc, "allocate IRQ %d, hwirq %lu\n", irq,  hwirq);

	ret = girq->child_to_parent_hwirq(gc, hwirq, type,
					  &parent_hwirq, &parent_type);
	if (ret) {
		chip_err(gc, "can't look up hwirq %lu\n", hwirq);
		return ret;
	}
	chip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);

	/*
	 * We set handle_bad_irq because the .set_type() should
	 * always be invoked and set the right type of handler.
	 */
	irq_domain_set_info(d,
			    irq,
			    hwirq,
			    gc->irq.chip,
			    gc,
			    girq->handler,
			    NULL, NULL);
	irq_set_probe(irq);

	/* This parent only handles asserted level IRQs */
	ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
					      parent_hwirq, parent_type);
	if (ret)
		return ret;

	chip_dbg(gc, "alloc_irqs_parent for %d parent hwirq %d\n",
		  irq, parent_hwirq);
	irq_set_lockdep_class(irq, gc->irq.lock_key, gc->irq.request_key);
	ret = irq_domain_alloc_irqs_parent(d, irq, 1, &gpio_parent_fwspec);
	/*
	 * If the parent irqdomain is msi, the interrupts have already
	 * been allocated, so the EEXIST is good.
	 */
	if (irq_domain_is_msi(d->parent) && (ret == -EEXIST))
		ret = 0;
	if (ret)
		chip_err(gc,
			 "failed to allocate parent hwirq %d for hwirq %lu\n",
			 parent_hwirq, hwirq);

	return ret;
}

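This is the .alloc callback that gpiolib (the GPIO subsystem) provides for a GPIO controller acting as a hierarchical irq_domain.

Who calls gpiolib's alloc callback

When a peripheral's DT node contains:

  • interrupt-parent = <&tlmm>;

  • interrupts = <gpio_num flags>; (the TLMM normally has #interrupt-cells = <2>)

the driver's probe calls platform_get_irq()/of_irq_get(), which reaches irq_create_fwspec_mapping() and finally triggers irq_domain_alloc_irqs_hierarchy() on the TLMM's GPIO irq_domain, which in turn calls: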

d->ops->alloc == gpiochip_hierarchy_irq_domain_alloc

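Parsing the fwspec

ret = gc->irq.child_irq_domain_ops.translate(d, fwspec, &hwirq, &type); // trace32 shows this is gpiochip_hierarchy_irq_domain_translate
  • fwspec holds the parameters parsed from the DT (for the TLMM usually <gpio_num flags>)

  • translate() outputs:

    • hwirq: the hardware number in the child domain, which for the TLMM is the GPIO number

    • type: the trigger type (edge/level and so on)

How the child hwirq maps to the parent domain's hwirq/type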

ret = girq->child_to_parent_hwirq(gc, hwirq, type,
                                 &parent_hwirq, &parent_type);

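This step is crucial, and it is only needed in the hierarchy model.

  • The TLMM normally works through a summary IRQ (a single parent IRQ) plus a scan of the TLMM pending registers, so strictly speaking an individual GPIO does not have its own parent_hwirq at the GIC.

  • Qualcomm platforms, however, often have an upper-level wakeup domain such as wakeup-parent / MPM: on the low-power wakeup path, some GPIOs really do have to be mapped to a parent_hwirq of the wakeup controller/MPM.

  • In pinctrl-msm.c you will find:

    • chip->irq.parent_domain = irq_find_matching_host(np, DOMAIN_BUS_WAKEUP);

    • chip->irq.child_to_parent_hwirq = msm_gpio_wakeirq;

    which sets the function pointer for this step to the "GPIO → wakeirq (MPM line)" mapping.

In essence, this code works out which hardware number (and trigger type) the GPIO child interrupt corresponds to in the parent domain.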

Populating the child domain

irq_domain_set_info(d, irq, hwirq,
                    gc->irq.chip, gc,
                    girq->handler,
                    NULL, NULL);
irq_set_probe(irq);
  • irq_data->hwirq = hwirq ✅ (child-domain hwirq: the GPIO number)

  • irq_data->chip = gc->irq.chip ✅ (the TLMM's irq_chip)

  • irq_data->chip_data = gc

  • irq_desc(virq)->handle_irq = girq->handler ✅ (the flow handler: initially handle_bad_irq, replaced with the correct one later by .set_type())

Interrupt registration flow

int request_threaded_irq(unsigned int irq, irq_handler_t handler,
			 irq_handler_t thread_fn, unsigned long irqflags,
			 const char *devname, void *dev_id)
{
	struct irqaction *action;
	struct irq_desc *desc;
	int retval;

	if (irq == IRQ_NOTCONNECTED)
		return -ENOTCONN;

	/*
	 * Sanity-check: shared interrupts must pass in a real dev-ID,
	 * otherwise we'll have trouble later trying to figure out
	 * which interrupt is which (messes up the interrupt freeing
	 * logic etc).
	 *
	 * Also shared interrupts do not go well with disabling auto enable.
	 * The sharing interrupt might request it while it's still disabled
	 * and then wait for interrupts forever.
	 *
	 * Also IRQF_COND_SUSPEND only makes sense for shared interrupts and
	 * it cannot be set along with IRQF_NO_SUSPEND.
	 */
	if (((irqflags & IRQF_SHARED) && !dev_id) ||			// a shared interrupt must pass a non-NULL dev_id
	    ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) ||
	    (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
	    ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
		return -EINVAL;

	desc = irq_to_desc(irq);	//从中断号获取中断描述符
	if (!desc)
		return -EINVAL;

	if (!irq_settings_can_request(desc) ||	//判断中断描述符是否可以被请求
	    WARN_ON(irq_settings_is_per_cpu_devid(desc)))
		return -EINVAL;

	if (!handler) {	// no primary handler was supplied
		if (!thread_fn)	// no threaded handler either: nothing to run
			return -EINVAL;
		// threaded-only request: fall back to the default primary handler
		handler = irq_default_primary_handler;
	}

	// allocate a struct irqaction
	action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
	if (!action)
		return -ENOMEM;

	action->handler = handler;
	action->thread_fn = thread_fn;
	action->flags = irqflags;
	action->name = devname;
	action->dev_id = dev_id;

	retval = irq_chip_pm_get(&desc->irq_data);
	if (retval < 0) {
		kfree(action);
		return retval;
	}

	// install the action on the descriptor
	retval = __setup_irq(irq, desc, action);

	if (retval) {
		irq_chip_pm_put(&desc->irq_data);
		kfree(action->secondary);
		kfree(action);
	}

#ifdef CONFIG_DEBUG_SHIRQ_FIXME
	if (!retval && (irqflags & IRQF_SHARED)) {
		/*
		 * It's a shared IRQ -- the driver ought to be prepared for it
		 * to happen immediately, so let's make sure....
		 * We disable the irq to make sure that a 'real' IRQ doesn't
		 * run in parallel with our fake.
		 */
		unsigned long flags;

		disable_irq(irq);
		local_irq_save(flags);

		handler(irq, dev_id);

		local_irq_restore(flags);
		enable_irq(irq);
	}
#endif
	return retval;
}

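Parameters

  • irq: the interrupt number (the Linux virq) being requested.

  • handler: the primary handler, which runs in hard-interrupt context.

  • thread_fn: the threaded handler, which runs in process context in a dedicated kernel thread (not in softirq context). If handler is NULL, a default primary handler is installed whose only job is to wake this thread.

  • irqflags: flags for the request (for example shared line, ONESHOT, trigger type).

  • devname: the name of the interrupt as shown in /proc/interrupts, usually the device name.

  • dev_id: a cookie passed back to the handler; on a shared line it tells the actions of different devices apart.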

Validating the input parameters

if (((irqflags & IRQF_SHARED) && !dev_id) ||			// a shared interrupt must pass a non-NULL dev_id
    ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) ||
    (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
    ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
    return -EINVAL;

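This performs some basic validation of the irqflags argument:

  • If IRQF_SHARED (shared interrupt) is set, dev_id must be passed; otherwise -EINVAL is returned.

  • A shared interrupt with auto-enable disabled (IRQF_SHARED together with IRQF_NO_AUTOEN) is not allowed.

  • IRQF_COND_SUSPEND only makes sense for shared interrupts, so it is rejected when IRQF_SHARED is not set.

  • IRQF_COND_SUSPEND and IRQF_NO_SUSPEND are mutually exclusive and cannot be set together.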
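The four rules above can be sketched as a small user-space predicate. This is only a model of the check, not kernel code: the `DEMO_` constants are local stand-ins whose values are illustrative, and `demo_flags_ok()` is a hypothetical helper name.

```c
#include <stddef.h>

/* Local stand-ins for the kernel flag bits; the values are illustrative. */
#define DEMO_IRQF_SHARED        0x00000080u
#define DEMO_IRQF_NO_SUSPEND    0x00004000u
#define DEMO_IRQF_COND_SUSPEND  0x00040000u
#define DEMO_IRQF_NO_AUTOEN     0x00080000u

/* Returns 1 when the flag combination passes the sanity check, 0 otherwise. */
static int demo_flags_ok(unsigned long flags, const void *dev_id)
{
	int shared = (flags & DEMO_IRQF_SHARED) != 0;

	if (shared && dev_id == NULL)
		return 0;	/* shared line without a cookie */
	if (shared && (flags & DEMO_IRQF_NO_AUTOEN))
		return 0;	/* shared line with auto-enable disabled */
	if (!shared && (flags & DEMO_IRQF_COND_SUSPEND))
		return 0;	/* COND_SUSPEND only applies to shared lines */
	if ((flags & DEMO_IRQF_NO_SUSPEND) && (flags & DEMO_IRQF_COND_SUSPEND))
		return 0;	/* mutually exclusive */
	return 1;
}
```

In the kernel every rejected combination maps to the same -EINVAL, so a driver that hits it has to work out from its own flags which rule it broke.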

Checking the handlers

if (!handler) { // no primary handler was supplied
    if (!thread_fn) // no threaded handler either
        return -EINVAL;
    handler = irq_default_primary_handler; // threaded-only request: use the default primary handler
}

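If handler is NULL, a thread_fn must be provided as the threaded handler; if both are NULL, -EINVAL is returned. If only thread_fn is provided, handler is set to the default primary handler irq_default_primary_handler, which simply wakes the thread.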
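A threaded-only request has one more hurdle that is easy to miss: as the __setup_irq() listing in this article shows, a request that ends up with the default primary handler is rejected unless IRQF_ONESHOT is set or the irqchip is IRQCHIP_ONESHOT_SAFE. The sketch below models both checks together in user space. All `demo_` names and constant values are stand-ins, and nested-thread interrupts are deliberately left out.

```c
#include <stddef.h>

typedef int (*demo_handler_t)(int irq, void *dev_id);

#define DEMO_IRQF_ONESHOT 0x00002000u	/* stand-in value */

/* Stand-in for irq_default_primary_handler: it only asks for a thread wakeup. */
static int demo_default_primary(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return 2;	/* stand-in for IRQ_WAKE_THREAD */
}

/* Sample driver handlers used to exercise the model. */
static int demo_hard_fn(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return 1;	/* stand-in for IRQ_HANDLED */
}

static int demo_thread_fn(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return 1;
}

/*
 * Returns the primary handler that would end up installed, or NULL when
 * the request would be rejected with -EINVAL.
 */
static demo_handler_t demo_pick_primary(demo_handler_t handler,
					demo_handler_t thread_fn,
					unsigned long flags,
					int chip_oneshot_safe)
{
	if (!handler) {
		if (!thread_fn)
			return NULL;
		handler = demo_default_primary;
	}
	/* One-shot safe chips get the ONESHOT flag cleared. */
	if (chip_oneshot_safe)
		flags &= ~(unsigned long)DEMO_IRQF_ONESHOT;
	/* Threaded-only requests need ONESHOT unless the chip is one-shot safe. */
	if (!(flags & DEMO_IRQF_ONESHOT) &&
	    handler == demo_default_primary && !chip_oneshot_safe)
		return NULL;
	return handler;
}
```

In practice this means a driver that passes NULL as the primary handler should normally also pass IRQF_ONESHOT.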

Initializing the irqaction

action->handler = handler;
action->thread_fn = thread_fn;
action->flags = irqflags;
action->name = devname;
action->dev_id = dev_id;

This fills in the irqaction structure with the primary handler, the threaded handler, the flags, and the device information.

Taking a power-management reference for the irq_desc

retval = irq_chip_pm_get(&desc->irq_data);
if (retval < 0) {
    kfree(action);
    return retval;
}

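irq_chip_pm_get() is called to take a runtime power-management reference on the device behind the irqchip of this irq_desc (if there is one), so that the controller stays powered while the interrupt is in use. If that fails, the allocated irqaction is freed and the error is returned.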

Installing the interrupt handler

retval = __setup_irq(irq, desc, action);

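__setup_irq() is called to set up the irq_desc; it binds the handler (primary handler) and thread_fn (threaded handler) carried by the action to the interrupt number irq. If it fails, the power-management reference is dropped and the action is freed.

__setup_irq() is one of the core functions in the Linux kernel for setting up an interrupt request. It installs the handler, sets up threading, and manages shared and threaded interrupts. Each part of the function is analyzed in detail below.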

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
	struct irqaction *old, **old_ptr;
	unsigned long flags, thread_mask = 0;
	int ret, nested, shared = 0;

	if (!desc)
		return -EINVAL;

	if (desc->irq_data.chip == &no_irq_chip)
		return -ENOSYS;
	if (!try_module_get(desc->owner))
		return -ENODEV;

	new->irq = irq;

	/*
	 * If the trigger type is not specified by the caller,
	 * then use the default for this interrupt.
	 */
	if (!(new->flags & IRQF_TRIGGER_MASK))
		new->flags |= irqd_get_trigger_type(&desc->irq_data);

	/*
	 * Check whether the interrupt nests into another interrupt
	 * thread.
	 */
	// is this a nested-thread interrupt?
	nested = irq_settings_is_nested_thread(desc);
	if (nested) {
		if (!new->thread_fn) {
			ret = -EINVAL;
			goto out_mput;
		}
		/*
		 * Replace the primary handler which was provided from
		 * the driver for non nested interrupt handling by the
		 * dummy function which warns when called.
		 */
		new->handler = irq_nested_primary_handler;
	} else {
		// apply forced threading if this interrupt is allowed to be threaded
		if (irq_settings_can_thread(desc)) {
			ret = irq_setup_forced_threading(new);
			if (ret)
				goto out_mput;
		}
	}

	/*
	 * Create a handler thread when a thread function is supplied
	 * and the interrupt does not nest into another interrupt
	 * thread.
	 */
	// a thread function was supplied and the interrupt is not nested
	if (new->thread_fn && !nested) {
		// create the handler thread
		ret = setup_irq_thread(new, irq, false);
		if (ret)
			goto out_mput;
		if (new->secondary) {
			ret = setup_irq_thread(new->secondary, irq, true);
			if (ret)
				goto out_thread;
		}
	}

	/*
	 * Drivers are often written to work w/o knowledge about the
	 * underlying irq chip implementation, so a request for a
	 * threaded irq without a primary hard irq context handler
	 * requires the ONESHOT flag to be set. Some irq chips like
	 * MSI based interrupts are per se one shot safe. Check the
	 * chip flags, so we can avoid the unmask dance at the end of
	 * the threaded handler for those.
	 */
	// if the irqchip sets IRQCHIP_ONESHOT_SAFE, clear IRQF_ONESHOT;
	// IRQCHIP_ONESHOT_SAFE is an irqchip flag that marks the chip as
	// inherently one-shot safe, so the mask/unmask around the thread can be skipped
	if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
		new->flags &= ~IRQF_ONESHOT;

	/*
	 * Protects against a concurrent __free_irq() call which might wait
	 * for synchronize_hardirq() to complete without holding the optional
	 * chip bus lock and desc->lock. Also protects against handing out
	 * a recycled oneshot thread_mask bit while it's still in use by
	 * its previous owner.
	 */
	mutex_lock(&desc->request_mutex);

	/*
	 * Acquire bus lock as the irq_request_resources() callback below
	 * might rely on the serialization or the magic power management
	 * functions which are abusing the irq_bus_lock() callback,
	 */
	chip_bus_lock(desc);

	/* First installed action requests resources. */
	// is this the first irqaction on the descriptor?
	if (!desc->action) {
		ret = irq_request_resources(desc); // let the irqchip claim the resources this interrupt needs
		if (ret) {
			pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
			       new->name, irq, desc->irq_data.chip->name);
			goto out_bus_unlock;
		}
	}

	/*
	 * The following block of code has to be executed atomically
	 * protected against a concurrent interrupt and any of the other
	 * management calls which are not serialized via
	 * desc->request_mutex or the optional bus lock.
	 */
	raw_spin_lock_irqsave(&desc->lock, flags);
	old_ptr = &desc->action;
	old = *old_ptr;
	if (old) {
		/*
		 * Can't share interrupts unless both agree to and are
		 * the same type (level, edge, polarity). So both flag
		 * fields must have IRQF_SHARED set and the bits which
		 * set the trigger type must match. Also all must
		 * agree on ONESHOT.
		 * Interrupt lines used for NMIs cannot be shared.
		 */
		unsigned int oldtype;

		if (desc->istate & IRQS_NMI) {
			pr_err("Invalid attempt to share NMI for %s (irq %d) on irqchip %s.\n",
				new->name, irq, desc->irq_data.chip->name);
			ret = -EINVAL;
			goto out_unlock;
		}

		/*
		 * If nobody did set the configuration before, inherit
		 * the one provided by the requester.
		 */
		// a trigger type has already been configured
		if (irqd_trigger_type_was_set(&desc->irq_data)) {
			oldtype = irqd_get_trigger_type(&desc->irq_data);
		} else {
			// no trigger type has been configured yet
			oldtype = new->flags & IRQF_TRIGGER_MASK;
			irqd_set_trigger_type(&desc->irq_data, oldtype); // record the requester's trigger type
		}

		if (!((old->flags & new->flags) & IRQF_SHARED) ||
		    (oldtype != (new->flags & IRQF_TRIGGER_MASK)) ||
		    ((old->flags ^ new->flags) & IRQF_ONESHOT))
			goto mismatch;

		/* All handlers must agree on per-cpuness */
		if ((old->flags & IRQF_PERCPU) !=
		    (new->flags & IRQF_PERCPU))	// IRQF_PERCPU marks a per-CPU interrupt
			goto mismatch;

		/* add new interrupt at end of irq queue */
		do {
			/*
			 * Or all existing action->thread_mask bits,
			 * so we can find the next zero bit for this
			 * new action.
			 */
			thread_mask |= old->thread_mask;
			old_ptr = &old->next;
			old = *old_ptr;
		} while (old);
		shared = 1;
	}

	/*
	 * Setup the thread mask for this irqaction for ONESHOT. For
	 * !ONESHOT irqs the thread mask is 0 so we can avoid a
	 * conditional in irq_wake_thread().
	 */
	// ONESHOT: the line stays masked until the threaded handler has finished
	if (new->flags & IRQF_ONESHOT) {
		/*
		 * Unlikely to have 32 resp 64 irqs sharing one line,
		 * but who knows.
		 */
		if (thread_mask == ~0UL) {
			ret = -EBUSY;
			goto out_unlock;
		}
		/*
		 * The thread_mask for the action is or'ed to
		 * desc->thread_active to indicate that the
		 * IRQF_ONESHOT thread handler has been woken, but not
		 * yet finished. The bit is cleared when a thread
		 * completes. When all threads of a shared interrupt
		 * line have completed desc->threads_active becomes
		 * zero and the interrupt line is unmasked. See
		 * handle.c:irq_wake_thread() for further information.
		 *
		 * If no thread is woken by primary (hard irq context)
		 * interrupt handlers, then desc->threads_active is
		 * also checked for zero to unmask the irq line in the
		 * affected hard irq flow handlers
		 * (handle_[fasteoi|level]_irq).
		 *
		 * The new action gets the first zero bit of
		 * thread_mask assigned. See the loop above which or's
		 * all existing action->thread_mask bits.
		 */
		new->thread_mask = 1UL << ffz(thread_mask);

	} else if (new->handler == irq_default_primary_handler &&
		   !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {
		/*
		 * The interrupt was requested with handler = NULL, so
		 * we use the default primary handler for it. But it
		 * does not have the oneshot flag set. In combination
		 * with level interrupts this is deadly, because the
		 * default primary handler just wakes the thread, then
		 * the irq lines is reenabled, but the device still
		 * has the level irq asserted. Rinse and repeat....
		 *
		 * While this works for edge type interrupts, we play
		 * it safe and reject unconditionally because we can't
		 * say for sure which type this interrupt really
		 * has. The type flags are unreliable as the
		 * underlying chip implementation can override them.
		 */
		pr_err("Threaded irq requested with handler=NULL and !ONESHOT for %s (irq %d)\n",
		       new->name, irq);
		ret = -EINVAL;
		goto out_unlock;
	}

	if (!shared) { // first action on this line (no existing action to share with)
		init_waitqueue_head(&desc->wait_for_threads); // initialize the wait queue

		/* Setup the type (level, edge polarity) if configured: */
		if (new->flags & IRQF_TRIGGER_MASK) {
			ret = __irq_set_trigger(desc,
						new->flags & IRQF_TRIGGER_MASK);	// program the trigger type

			if (ret)
				goto out_unlock;
		}

		/*
		 * Activate the interrupt. That activation must happen
		 * independently of IRQ_NOAUTOEN. request_irq() can fail
		 * and the callers are supposed to handle
		 * that. enable_irq() of an interrupt requested with
		 * IRQ_NOAUTOEN is not supposed to fail. The activation
		 * keeps it in shutdown mode, it merily associates
		 * resources if necessary and if that's not possible it
		 * fails. Interrupts which are in managed shutdown mode
		 * will simply ignore that activation request.
		 */
		ret = irq_activate(desc);
		if (ret)
			goto out_unlock;

		desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
				  IRQS_ONESHOT | IRQS_WAITING);
		irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);

		if (new->flags & IRQF_PERCPU) {
			irqd_set(&desc->irq_data, IRQD_PER_CPU);
			irq_settings_set_per_cpu(desc);
			if (new->flags & IRQF_NO_DEBUG)
				irq_settings_set_no_debug(desc);
		}

		if (noirqdebug)
			irq_settings_set_no_debug(desc);

		if (new->flags & IRQF_ONESHOT)
			desc->istate |= IRQS_ONESHOT;

		/* Exclude IRQ from balancing if requested */
		if (new->flags & IRQF_NOBALANCING) {
			irq_settings_set_no_balancing(desc);
			irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
		}

		if (!(new->flags & IRQF_NO_AUTOEN) &&
		    irq_settings_can_autoenable(desc)) {
			irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
		} else {
			/*
			 * Shared interrupts do not go well with disabling
			 * auto enable. The sharing interrupt might request
			 * it while it's still disabled and then wait for
			 * interrupts forever.
			 */
			WARN_ON_ONCE(new->flags & IRQF_SHARED);
			/* Undo nested disables: */
			desc->depth = 1;
		}

	} else if (new->flags & IRQF_TRIGGER_MASK) {
		unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK;
		unsigned int omsk = irqd_get_trigger_type(&desc->irq_data);

		if (nmsk != omsk)
			/* hope the handler works with current  trigger mode */
			pr_warn("irq %d uses trigger mode %u; requested %u\n",
				irq, omsk, nmsk);
	}

	*old_ptr = new;	// link the new action into the descriptor's action list

	irq_pm_install_action(desc, new);

	/* Reset broken irq detection when installing new handler */
	desc->irq_count = 0;
	desc->irqs_unhandled = 0;

	/*
	 * Check whether we disabled the irq via the spurious handler
	 * before. Reenable it and give it another chance.
	 */
	if (shared && (desc->istate & IRQS_SPURIOUS_DISABLED)) {
		desc->istate &= ~IRQS_SPURIOUS_DISABLED;
		__enable_irq(desc);	// re-enable the interrupt
	}

	raw_spin_unlock_irqrestore(&desc->lock, flags);
	chip_bus_sync_unlock(desc);
	mutex_unlock(&desc->request_mutex);

	irq_setup_timings(desc, new);	// set up interrupt timing tracking for this action, if enabled

	/*
	 * Strictly no need to wake it up, but hung_task complains
	 * when no hard interrupt wakes the thread up.
	 */
	if (new->thread)
		wake_up_process(new->thread);
	if (new->secondary)
		wake_up_process(new->secondary->thread);

	register_irq_proc(irq, desc);	// create the procfs directory for this irq
	new->dir = NULL;
	register_handler_proc(irq, new);	// create the procfs entry for this handler
	return 0;

mismatch:
	if (!(new->flags & IRQF_PROBE_SHARED)) {
		pr_err("Flags mismatch irq %d. %08x (%s) vs. %08x (%s)\n",
		       irq, new->flags, new->name, old->flags, old->name);
#ifdef CONFIG_DEBUG_SHIRQ
		dump_stack();
#endif
	}
	ret = -EBUSY;

out_unlock:
	raw_spin_unlock_irqrestore(&desc->lock, flags);

	if (!desc->action)
		irq_release_resources(desc);
out_bus_unlock:
	chip_bus_sync_unlock(desc);
	mutex_unlock(&desc->request_mutex);

out_thread:
	if (new->thread) {
		struct task_struct *t = new->thread;

		new->thread = NULL;
		kthread_stop(t);
		put_task_struct(t);
	}
	if (new->secondary && new->secondary->thread) {
		struct task_struct *t = new->secondary->thread;

		new->secondary->thread = NULL;
		kthread_stop(t);
		put_task_struct(t);
	}
out_mput:
	module_put(desc->owner);
	return ret;
}

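Overall structure of the function

The function's workflow falls roughly into the following parts:

  1. Input checks and initialization: validate the interrupt descriptor, take a reference on the owning module, and so on.

  2. Trigger type: make sure the interrupt trigger type is set correctly.

  3. Nested interrupt threads: check whether this is a nested-thread interrupt and, if so, handle it specially.

  4. Handler threads: set up and create the thread that runs the threaded handler.

  5. Resources and state: request the resources the interrupt needs and perform the required initialization.

  6. Shared interrupts: if the line is shared, perform the additional checks and handling.

  7. Activation and startup: activate the interrupt at the right moment and set the corresponding state.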

Setting the trigger type

if (!(new->flags & IRQF_TRIGGER_MASK))
    new->flags |= irqd_get_trigger_type(&desc->irq_data);

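If the caller does not explicitly specify a trigger type (no bit of IRQF_TRIGGER_MASK is set), irqd_get_trigger_type() is used to take the default trigger type from the irq_data of the irq_desc.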
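As a rough illustration of this defaulting rule, the following user-space sketch merges a request's flags with the trigger type already recorded for the line (for instance one configured earlier when the interrupt was mapped). The `DEMO_` constants and `demo_effective_flags()` are hypothetical names with illustrative values.

```c
/* Stand-ins for the trigger bits that make up IRQF_TRIGGER_MASK. */
#define DEMO_TRIGGER_RISING   0x1u
#define DEMO_TRIGGER_FALLING  0x2u
#define DEMO_TRIGGER_HIGH     0x4u
#define DEMO_TRIGGER_LOW      0x8u
#define DEMO_TRIGGER_MASK     0xfu

/*
 * Returns the flags the request effectively carries: if the caller set
 * no trigger bit, the type already configured on the line is inherited;
 * otherwise the caller's choice is kept untouched.
 */
static unsigned long demo_effective_flags(unsigned long req_flags,
					  unsigned int configured_type)
{
	if (!(req_flags & DEMO_TRIGGER_MASK))
		req_flags |= configured_type & DEMO_TRIGGER_MASK;
	return req_flags;
}
```

So a driver that requests a falling-edge line without any trigger flag still ends up with the falling-edge bit in its action flags, which matters later when shared requests are compared.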

Checking for a nested thread

nested = irq_settings_is_nested_thread(desc);
if (nested) {
    if (!new->thread_fn) {
        ret = -EINVAL;
        goto out_mput;
    }
    new->handler = irq_nested_primary_handler;
}

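Nested threads: this checks whether the interrupt is a nested-thread interrupt, that is, one that nests into another interrupt's handler thread. In that case a threaded handler thread_fn must be provided (otherwise -EINVAL is returned), and the primary handler is replaced with irq_nested_primary_handler, a dummy function that only warns if it is ever called. For a non-nested interrupt, the full function instead calls irq_setup_forced_threading() when irq_settings_can_thread() allows it.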

Creating the threaded interrupt handler

if (new->thread_fn && !nested) {
    ret = setup_irq_thread(new, irq, false);
    if (ret)
        goto out_mput;
    if (new->secondary) {
        ret = setup_irq_thread(new->secondary, irq, true);
        if (ret)
            goto out_thread;
    }
}
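  • Creating the interrupt thread: if thread_fn is provided and the interrupt is not nested, setup_irq_thread() is called to create a kernel thread for it.

  • If the action has a secondary action (new->secondary), a thread is created for that as well.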

Checking for ONESHOT interrupts

if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
    new->flags &= ~IRQF_ONESHOT;

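ONESHOT safe: if the irqchip sets IRQCHIP_ONESHOT_SAFE, the chip is inherently one-shot safe (MSI-based interrupts, for example), so IRQF_ONESHOT is cleared and the unmask step at the end of the threaded handler can be avoided.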

Locking and requesting resources

mutex_lock(&desc->request_mutex);
chip_bus_lock(desc);
if (!desc->action) {
    ret = irq_request_resources(desc);
    if (ret) {
        pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
               new->name, irq, desc->irq_data.chip->name);
        goto out_bus_unlock;
    }
}

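Requesting resources: desc->request_mutex is taken to serialize this request against a concurrent __free_irq(), and chip_bus_lock() takes the irqchip's optional bus lock. irq_request_resources() is then called to request the resources for this interrupt, but only when this is the first action installed on the descriptor.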

Checking and handling shared interrupts

raw_spin_lock_irqsave(&desc->lock, flags);
old_ptr = &desc->action;
old = *old_ptr;
if (old) {
    // handling for a line that already has an action (shared interrupt)
}

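Shared interrupts: this checks whether the interrupt already has a handler installed. If it does, the new request must be compatible with the existing one:

  • Both must set IRQF_SHARED, and the trigger type, IRQF_ONESHOT, and IRQF_PERCPU settings must match; otherwise the request fails with -EBUSY. A line used for an NMI cannot be shared at all.

  • The existing actions are walked to the end of the list while their thread_mask bits are OR-ed together, so that the new action is appended last and can be given a free bit.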
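The compatibility test for a second requester can be sketched as a pure function. This is a user-space model under stated assumptions: the `DEMO_` values are stand-ins, `demo_can_share()` is a hypothetical name, and `line_type` represents the trigger type already associated with the line.

```c
/* Stand-in flag values, for illustration only. */
#define DEMO_IRQF_TRIGGER_MASK  0x0000000fu
#define DEMO_IRQF_SHARED        0x00000080u
#define DEMO_IRQF_PERCPU        0x00000400u
#define DEMO_IRQF_ONESHOT       0x00002000u

/*
 * Returns 1 if a new action with new_flags may join a line whose first
 * action has old_flags and whose trigger type is line_type.
 */
static int demo_can_share(unsigned long old_flags, unsigned long new_flags,
			  unsigned int line_type)
{
	if (!((old_flags & new_flags) & DEMO_IRQF_SHARED))
		return 0;	/* both sides must agree to share */
	if (line_type != (new_flags & DEMO_IRQF_TRIGGER_MASK))
		return 0;	/* trigger type must match */
	if ((old_flags ^ new_flags) & DEMO_IRQF_ONESHOT)
		return 0;	/* all actions must agree on ONESHOT */
	if ((old_flags & DEMO_IRQF_PERCPU) != (new_flags & DEMO_IRQF_PERCPU))
		return 0;	/* and on per-CPU-ness */
	return 1;
}
```

When a shared request fails with a "Flags mismatch" message in the kernel log, comparing the two flag words printed there against these four conditions usually identifies the culprit.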

Setting the thread mask

if (new->flags & IRQF_ONESHOT) {
    if (thread_mask == ~0UL) {
        ret = -EBUSY;
        goto out_unlock;
    }
    new->thread_mask = 1UL << ffz(thread_mask);
}

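ONESHOT thread mask: if the interrupt is requested with IRQF_ONESHOT, the new irqaction is given the first free bit of thread_mask. That bit marks this action's thread as woken but not yet finished, and the line is unmasked only once every thread on a shared line has completed. If all bits are already taken, the request fails with -EBUSY.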
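The bit allocation is easy to reproduce in user space. The sketch below is a model, not the kernel implementation: `demo_ffz()` is a portable stand-in for the kernel's ffz(), and `demo_alloc_thread_mask()` is a hypothetical helper that combines the OR loop over the existing actions with the first-zero-bit search.

```c
/* Index of the first zero bit; the caller must ensure word != ~0UL. */
static unsigned int demo_ffz(unsigned long word)
{
	unsigned int bit = 0;

	while (word & 1UL) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/*
 * existing[] holds the thread_mask of every action already on the line.
 * Returns the mask for the new action, or 0 when every bit is in use
 * (the kernel returns -EBUSY in that case).
 */
static unsigned long demo_alloc_thread_mask(const unsigned long *existing,
					    unsigned int n)
{
	unsigned long used = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		used |= existing[i];
	if (used == ~0UL)
		return 0;
	return 1UL << demo_ffz(used);
}
```

Because each action owns exactly one bit, the number of ONESHOT actions that can share a line is bounded by the width of unsigned long, which is what the "32 resp 64 irqs" remark in the source comment refers to.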

Activating the interrupt and configuring the trigger type

if (!shared) {
    init_waitqueue_head(&desc->wait_for_threads);
    if (new->flags & IRQF_TRIGGER_MASK) {
        ret = __irq_set_trigger(desc, new->flags & IRQF_TRIGGER_MASK);
        if (ret)
            goto out_unlock;
    }
    ret = irq_activate(desc);
    if (ret)
        goto out_unlock;
}

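Activating the interrupt: if this is the first action on the line, the wait queue is initialized and the trigger type (rising edge, falling edge, and so on) is programmed through __irq_set_trigger(). irq_activate() is then called to activate the interrupt. In the full function, the interrupt is afterwards started with irq_startup() unless auto-enable has been disabled.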

Registering the procfs files

register_irq_proc(irq, desc);
register_handler_proc(irq, new);

This registers the interrupt in procfs so that users can inspect and debug it.

Creating the interrupt thread

static int
setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
{
	struct task_struct *t;

	if (!secondary) {
		t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
				   new->name);
	} else {
		t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq,
				   new->name);
	}

	if (IS_ERR(t))
		return PTR_ERR(t);

	sched_set_fifo(t);

	/*
	 * We keep the reference to the task struct even if
	 * the thread dies to avoid that the interrupt code
	 * references an already freed task_struct.
	 */
	new->thread = get_task_struct(t);
	/*
	 * Tell the thread to set its affinity. This is
	 * important for shared interrupt handlers as we do
	 * not invoke setup_affinity() for the secondary
	 * handlers as everything is already set up. Even for
	 * interrupts marked with IRQF_NO_BALANCE this is
	 * correct as we want the thread to move to the cpu(s)
	 * on which the requesting code placed the interrupt.
	 */
	set_bit(IRQTF_AFFINITY, &new->thread_flags);
	return 0;
}
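The two format strings in setup_irq_thread() explain the names you see for interrupt threads in a process listing. Here is a small user-space sketch that reproduces the naming; `demo_irq_thread_name()` and the device name "demo-dev" used with it are hypothetical.

```c
#include <stdio.h>

/*
 * Builds the thread name the same way the two kthread_create() calls
 * do: "irq/<irq>-<name>" for the normal handler thread and
 * "irq/<irq>-s-<name>" for the secondary one.
 * Returns what snprintf() returns.
 */
static int demo_irq_thread_name(char *buf, size_t len, unsigned int irq,
				const char *name, int secondary)
{
	if (secondary)
		return snprintf(buf, len, "irq/%u-s-%s", irq, name);
	return snprintf(buf, len, "irq/%u-%s", irq, name);
}
```

For example, an action named "demo-dev" on irq 45 would get a thread called irq/45-demo-dev, which makes it straightforward to match a thread back to its line in /proc/interrupts.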

The thread function of the created kernel thread, irq_thread

static int irq_thread(void *data)
{
	struct callback_head on_exit_work;
	struct irqaction *action = data;
	struct irq_desc *desc = irq_to_desc(action->irq);
	irqreturn_t (*handler_fn)(struct irq_desc *desc,
			struct irqaction *action);

	if (force_irqthreads() && test_bit(IRQTF_FORCED_THREAD,
					   &action->thread_flags))
		handler_fn = irq_forced_thread_fn;
	else
		handler_fn = irq_thread_fn;

	init_task_work(&on_exit_work, irq_thread_dtor);
	task_work_add(current, &on_exit_work, TWA_NONE);

	irq_thread_check_affinity(desc, action);

	while (!irq_wait_for_interrupt(action)) {
		irqreturn_t action_ret;

		irq_thread_check_affinity(desc, action);

		action_ret = handler_fn(desc, action);
		if (action_ret == IRQ_WAKE_THREAD)
			irq_wake_secondary(desc, action);

		wake_threads_waitq(desc);
	}

	/*
	 * This is the regular exit path. __free_irq() is stopping the
	 * thread via kthread_stop() after calling
	 * synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
	 * oneshot mask bit can be set.
	 */
	task_work_cancel(current, irq_thread_dtor);
	return 0;
}

It loops, waiting for an interrupt and then running the bottom-half (threaded) handler.

static int irq_wait_for_interrupt(struct irqaction *action)
{
	for (;;) {
		// mark the thread as interruptibly sleeping before checking for work, so a wakeup is not lost
		set_current_state(TASK_INTERRUPTIBLE);

		if (kthread_should_stop()) {
			// a wakeup is still pending: handle it one last time
			/* may need to run one last time */
			if (test_and_clear_bit(IRQTF_RUNTHREAD,
					       &action->thread_flags)) {
				// back to TASK_RUNNING: leave the sleeping state and run
				__set_current_state(TASK_RUNNING);
				return 0;
			}
			__set_current_state(TASK_RUNNING);
			return -1;
		}

		if (test_and_clear_bit(IRQTF_RUNTHREAD,
				       &action->thread_flags)) {
			__set_current_state(TASK_RUNNING);
			return 0;
		}
		// voluntarily give up the CPU: sleep until woken
		schedule();
	}
}

The bottom-half interrupt thread voluntarily gives up the CPU and stays asleep in the TASK_INTERRUPTIBLE state, waiting for the hard interrupt to wake it.
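The decision logic of irq_wait_for_interrupt() can be reduced to a small state function, which makes the "one last run" case on shutdown easier to see. This is a user-space model only: `should_stop` stands in for kthread_should_stop(), `*runthread` for the IRQTF_RUNTHREAD bit, and all `demo_` names are hypothetical.

```c
enum demo_wait_result {
	DEMO_RUN_HANDLER = 0,	/* the function would return 0 */
	DEMO_EXIT_THREAD = -1,	/* the function would return -1 */
	DEMO_SLEEP = 1		/* the function would call schedule() */
};

/*
 * One pass through the loop body. The pending flag is consumed on every
 * pass, which models test_and_clear_bit().
 */
static enum demo_wait_result demo_wait_step(int should_stop, int *runthread)
{
	int pending = *runthread;

	*runthread = 0;
	if (should_stop)
		return pending ? DEMO_RUN_HANDLER : DEMO_EXIT_THREAD;
	return pending ? DEMO_RUN_HANDLER : DEMO_SLEEP;
}
```

Note that the pending flag is a single bit: several hard interrupts arriving before the thread runs collapse into one pass of the threaded handler.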

Interrupt trigger flow

Here is a flowchart of interrupt triggering drawn by my colleague @Melokc. It annotates in detail the whole path from GIC interrupt handling to irq_flow_handler processing.

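PS: The functions in this diagram are based on the Linux-6.1 kernel. Some functions differ in other kernel versions, but the overall mechanism is the same.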