开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 系统运维 -> Linux设备驱动开发--- DMA -> 正文阅读

[系统运维]Linux设备驱动开发--- DMA

DMA是计算机系统的一项功能，它允许设备在没有CPU的干预的情况下访问系统主存储器RAM，使CPU完成其他任务。

DMA控制器是负责DMA管理的外设，在现代处理器和微控制器中都能发现它。DMA功能用于执行内存读写和写入操作而不占用CPU周期。当需要传输数据块时，处理器向DMA控制器提供源地址和目标地址以及总字节数。然后，DMA控制器会自动将数据从源传输到目标，而不会占用CPU周期。

1 设置DMA映射

对于任何类型的DMA传输，都需要提供源地址和目标地址，以及需要传输的字数。在外设DMA的情况下，外设的FIFO用作源或目标。

使用外设DMA时，根据传输方向指定源或目标地址，换句话说，DMA传输需要适当的内存映射，这是接下来要讨论的内容。

缓存一致性和DMA

想象一种情况，CPU配备了缓存和外部存储器，设备用DMA可以直接访问它们。当CPU访问内存中X时，当前值将被存储到缓存中，假设采用回写式缓存，对X的后续操作将更新X的缓存副本，而不更新X的外部存储器版本。如果在下一次设备尝试访问X之前，缓存未刷新到内存，则设备将受到旧的X值，同样，如果把新值写入内存时未使缓存的X副本变为无效，那么CPU将操作旧的X值。。

两个独立的设备共享内存就会存在这种问题，缓存一致性确保每个写入操作都是瞬间发生的，共享内存区域的所有设备都会看到完全相同的更改顺序。

DMA映射

所有合适的DMA传输都需要适当的内存映射，DMA映射包括DMA缓冲区和为其生成总线地址。设备实际上使用总线地址。总线地址是dma_addr_t类型的每个实例。

映射分为两种类型：一致DMA映射和流式DMA映射。前者用于多次传输，它会自动解决缓冲一致性问题。流映射有很多限制，他不会自动解决一致性问题，但有一种解决方案，即在每次传输之间调用几个函数。一致DMA映射通常存在于驱动程序的整个生命周期中，而流映射则通常在DMA传输完成时立即取消映射。

处理DMA映射时应该包含的主要头文件如下：

#include <linux/dma-mapping>

一致映射

下面函数设置一致映射：

void *dma_alloc_coherent(struct device *dev,size_t size,dma_addr_t *dma_handle,gfp_t flag);

此函数处理缓冲区的分配和映射，并返回该缓冲区的内核虚拟地址，该缓冲区大小为size字节，可供CPU访问。dev时设备结构。dma_handle指向总线地址，分配给映射的内存物理上一定是连续的，flag决定内存的分配方式，通常是GFP_KERNEL或GFP_ATOMIC。

释放映射可以使用以下函数：

void dma_free_coherent(struct device *dev,size_t size,void *cpu_addr,dma_addr_t dma_handle);

cpu_addr对应dma_alloc_coherent（）返回的内核虚拟地址。这个映射是很宝贵的，它可以分配的最小内存值是一个页面，它分配的页面数量只能是2次幂，对于持续存在于设备生命周期内的缓冲区，应该使用这种映射。

流式DMA映射

流式映射有更多的限制，由于以下原因而不同于一致映射。

映射需要使用已分配的缓冲区
映射可以接受几个分散的不连续缓冲区
映射的缓冲区属于设备而不属于CPU。CPU在使用缓冲区之前，应该首先解除映射，这是为了缓存
对于写入事务，驱动程序应该在映射之前将数据放入缓冲区
必须指定数据移动的方向，只能基于该方向使用数据

为什么在取消映射之前不该访问缓冲区？原因很简单：CPU映射是可缓存的。用于流式映射的dma_map_*()系列函数将首先清理与缓冲区相关的缓存（使之无效），在出现相应的dma_unmap_*()之前，CPU不能访问。

实际上流式映射有两种形式：

单缓存映射，它只允许单页映射。
分散/聚集映射，它允许传递多个缓冲区（缓冲区分散在内存中）

对于这两种中的任何一种映射，都应该用include/linux/dma-direction.h中定义的enum dma_data_direction类型符号来指定方向：

enum dma_data_direction{
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE =1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE =3,
};

单缓存区映射
可以使用下面的方法设置单缓存区

dma_addr_t dma_map_single(struct device *dev,void *ptr,size_t size,enum dma_data_direction direction);

ptr是缓冲区的内核虚拟地址，dma_addr_t是设备返回的总线地址，确保使用真正适合需求的方向。
使用下面的函数释放该映射：

void dma_unmap_single(struct device *dev,dma_addr_t dma_addr,size_t size,enum dma_data_direction direction);

分散/聚集映射
分散/聚集映射是一种特殊类型的流式映射，可以在单个槽中传输多个缓冲区区域，而不是单独映射每个缓冲区并逐个传输它们。假设有几个缓冲区物理上是不连续的，所有这些缓存区都需要同时传输到设备或从设备传输。

内核将分散的缓冲区表示为struct scatterlist：

struct scatterlist{
	unsigned long page_link;
	unsigned int offset;
	unsigned int length;
	dma_addr_t dma_address;
	unsigned int dma_length;
};

为了设置分散列表映射，应该进行如下操作：

分配分散的缓冲区
创建分散列表数组，并使用sg_set_buf()分配的内存填充它。
在该分散列表上调用dma_map_sg（）
一旦完成DMA，就调用dma_unmap_sg（）来取消映射分散列表

u32 *wbuf1,*wbuf2,*wbuf3;
/* 分配分散的缓冲区 */
wbuf1 = kzalloc(SDMA_BUF_SIZE,GFP_DMA);
wbuf2 = kzalloc(SDMA_BUF_SIZE,GFP_DMA);
wbuf3 = kzalloc(SDMA_BUF_SIZE/2,GFP_DMA);

/* 创建分散列表数组 */
struct scatterlist sg[3];

/* 使用sg_set_buf()分配的内存填充它 */
sg_init_table(sg,3);
sg_set_buf(&sg[0],wbuf1,SDMA_BUF_SIZE);
sg_set_buf(&sg[1],wbuf2,SDMA_BUF_SIZE);
sg_set_buf(&sg[2],wbuf3,SDMA_BUF_SIZE/2);

ret = dma_map_sg(NULL,sg,3,DMA_MEM_TO_MEM);

dma_map_sg()和dma_unmap_sg()负责缓存一致性。但是，如果需要使用相同的映射来访问DMA传输之间的数据，则必须以适当的方式在每次传输之间同步缓冲区，如果CPU需要访问缓冲区，可以通过dma_sync_sg_for_cpu（）进行同步，如果设备需要，则调用dma_sync_sg_for_device（）进行同步。单区域映射的函数是dma_sync_single_for_cpu（）和dma_sync_single_for_device（）

2 完成的概念

使用完成需要下面的文件：

#include <linux/completion.h>

完成由struct completion表示，可以动态或静态创建。

静态创建生命和初始化如下：

DECLARE_COMPLETION(my_comp);

动态分配如下：

struct completion my_comp;
init_completion(&my_comp);

当驱动程序启动的工作必须等待某件事情完成时，它只需将等待完成事件传递给wait_for_completion（）函数，调用wait_for_completion()任务将会被阻塞

void wait_for_completion(struct completion *comp);

代码的其他部分确定该完成事件发生时，它可以用以下方式唤醒正在等待该事件的进程：

void complete(struct completion *comp);
void cimplete_all(struct completion *comp);

complete只会唤醒一个等待进程，而complete_all（）会唤醒等待该事件的所有进程。

3 DMA引擎API

DMA引擎是开发DMA控制器驱动程序的通用内核框架。DMA的主要目标是在复制内存时减轻CPU的负担。使用通道将I/O数据传输委托给DMA引擎，DMA引擎通过其驱动程序提供一组可供其他设备（从设备）使用的通道。

必须引用头文件如下：

#include <linux/dmaengine.h>

从设备DMA用法很简单，它包含如下步骤：

分配DMA从通道
设置从设备和控制器的特定参数
获取事务的描述符
提交事物
发出挂起的请求并等待回调通知

分配DMA从通道

使用dma_request_channel()请求通道：

struct dma_chan * dma_request_channel(const dma_cap_mask_t *mask,dma_filter_fn fn,void *fn_param);

mask是为了指定驱动程序需要执行的传输类型，表示该通道必须满足的功能，是位图掩码

dma_filter_fn定义为：

typedef bool （*dma_filter_fn)(struct dma_chan *chan,void *filter_param);

如果filter_fn参数为NULL，则dma_request_channel将只返回满足功能掩码的第一个通道。否则，当掩码参数不足以指定所需的通道时，可以使用filter_fn作为系统中可用通道的过滤器。

通过此接口分配的通道由调用者独占，直到调用dma_release_channel（）为止：

void dma_release_channel(struct dma_chan *chan);

设置从设备和控制器指定参数

此步骤引入了新的数据结构struct dma_slave_config，它表示DMA从通道的运行配置。这允许为外设指定设置。

int dmaengine_slave_config(struct dma_chan *chan,struct dma_slave_config *config);

/**
 * struct dma_slave_config - dma slave channel runtime config
 * @direction: whether the data shall go in or out on this slave
 * channel, right now. DMA_MEM_TO_DEV and DMA_DEV_TO_MEM are
 * legal values. DEPRECATED, drivers should use the direction argument
 * to the device_prep_slave_sg and device_prep_dma_cyclic functions or
 * the dir field in the dma_interleaved_template structure.
 * @src_addr: this is the physical address where DMA slave data
 * should be read (RX), if the source is memory this argument is
 * ignored.
 * @dst_addr: this is the physical address where DMA slave data
 * should be written (TX), if the source is memory this argument
 * is ignored.
 * @src_addr_width: this is the width in bytes of the source (RX)
 * register where DMA data shall be read. If the source
 * is memory this may be ignored depending on architecture.
 * Legal values: 1, 2, 3, 4, 8, 16, 32, 64, 128.
 * @dst_addr_width: same as src_addr_width but for destination
 * target (TX) mutatis mutandis.
 * @src_maxburst: the maximum number of words (note: words, as in
 * units of the src_addr_width member, not bytes) that can be sent
 * in one burst to the device. Typically something like half the
 * FIFO depth on I/O peripherals so you don't overflow it. This
 * may or may not be applicable on memory sources.
 * @dst_maxburst: same as src_maxburst but for destination target
 * mutatis mutandis.
 * @src_port_window_size: The length of the register area in words the data need
 * to be accessed on the device side. It is only used for devices which is using
 * an area instead of a single register to receive the data. Typically the DMA
 * loops in this area in order to transfer the data.
 * @dst_port_window_size: same as src_port_window_size but for the destination
 * port.
 * @device_fc: Flow Controller Settings. Only valid for slave channels. Fill
 * with 'true' if peripheral should be flow controller. Direction will be
 * selected at Runtime.
 * @slave_id: Slave requester id. Only valid for slave channels. The dma
 * slave peripheral will have unique id as dma requester which need to be
 * pass as slave config.
 * @peripheral_config: peripheral configuration for programming peripheral
 * for dmaengine transfer
 * @peripheral_size: peripheral configuration buffer size
 *
 * This struct is passed in as configuration data to a DMA engine
 * in order to set up a certain channel for DMA transport at runtime.
 * The DMA device/engine has to provide support for an additional
 * callback in the dma_device structure, device_config and this struct
 * will then be passed in as an argument to the function.
 *
 * The rationale for adding configuration information to this struct is as
 * follows: if it is likely that more than one DMA slave controllers in
 * the world will support the configuration option, then make it generic.
 * If not: if it is fixed so that it be sent in static from the platform
 * data, then prefer to do that.
 */
struct dma_slave_config {
	enum dma_transfer_direction direction;
	phys_addr_t src_addr;
	phys_addr_t dst_addr;
	enum dma_slave_buswidth src_addr_width;
	enum dma_slave_buswidth dst_addr_width;
	u32 src_maxburst;
	u32 dst_maxburst;
	u32 src_port_window_size;
	u32 dst_port_window_size;
	bool device_fc;
	unsigned int slave_id;
	void *peripheral_config;
	size_t peripheral_size;
};

获取事务描述符

在获取DMA通道时，返回值是struct dma_chan结构的实例，它包含struct dma_device*device字段，它就是提供该通道的DMA控制器，此控制器内核驱动提供一组函数来准备DMA事务：

/**
 * struct dma_device - info on the entity supplying DMA services
 * @chancnt: how many DMA channels are supported
 * @privatecnt: how many DMA channels are requested by dma_request_channel
 * @channels: the list of struct dma_chan
 * @global_node: list_head for global dma_device_list
 * @filter: information for device/slave to filter function/param mapping
 * @cap_mask: one or more dma_capability flags
 * @desc_metadata_modes: supported metadata modes by the DMA device
 * @max_xor: maximum number of xor sources, 0 if no capability
 * @max_pq: maximum number of PQ sources and PQ-continue capability
 * @copy_align: alignment shift for memcpy operations
 * @xor_align: alignment shift for xor operations
 * @pq_align: alignment shift for pq operations
 * @fill_align: alignment shift for memset operations
 * @dev_id: unique device ID
 * @dev: struct device reference for dma mapping api
 * @owner: owner module (automatically set based on the provided dev)
 * @src_addr_widths: bit mask of src addr widths the device supports
 *	Width is specified in bytes, e.g. for a device supporting
 *	a width of 4 the mask should have BIT(4) set.
 * @dst_addr_widths: bit mask of dst addr widths the device supports
 * @directions: bit mask of slave directions the device supports.
 *	Since the enum dma_transfer_direction is not defined as bit flag for
 *	each type, the dma controller should set BIT(<TYPE>) and same
 *	should be checked by controller as well
 * @min_burst: min burst capability per-transfer
 * @max_burst: max burst capability per-transfer
 * @max_sg_burst: max number of SG list entries executed in a single burst
 *	DMA tansaction with no software intervention for reinitialization.
 *	Zero value means unlimited number of entries.
 * @residue_granularity: granularity of the transfer residue reported
 *	by tx_status
 * @device_alloc_chan_resources: allocate resources and return the
 *	number of allocated descriptors
 * @device_router_config: optional callback for DMA router configuration
 * @device_free_chan_resources: release DMA channel's resources
 * @device_prep_dma_memcpy: prepares a memcpy operation
 * @device_prep_dma_xor: prepares a xor operation
 * @device_prep_dma_xor_val: prepares a xor validation operation
 * @device_prep_dma_pq: prepares a pq operation
 * @device_prep_dma_pq_val: prepares a pqzero_sum operation
 * @device_prep_dma_memset: prepares a memset operation
 * @device_prep_dma_memset_sg: prepares a memset operation over a scatter list
 * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
 * @device_prep_slave_sg: prepares a slave dma operation
 * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
 *	The function takes a buffer of size buf_len. The callback function will
 *	be called after period_len bytes have been transferred.
 * @device_prep_interleaved_dma: Transfer expression in a generic way.
 * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
 * @device_caps: May be used to override the generic DMA slave capabilities
 *	with per-channel specific ones
 * @device_config: Pushes a new configuration to a channel, return 0 or an error
 *	code
 * @device_pause: Pauses any transfer happening on a channel. Returns
 *	0 or an error code
 * @device_resume: Resumes any transfer on a channel previously
 *	paused. Returns 0 or an error code
 * @device_terminate_all: Aborts all transfers on a channel. Returns 0
 *	or an error code
 * @device_synchronize: Synchronizes the termination of a transfers to the
 *  current context.
 * @device_tx_status: poll for transaction completion, the optional
 *	txstate parameter can be supplied with a pointer to get a
 *	struct with auxiliary transfer status information, otherwise the call
 *	will just return a simple status code
 * @device_issue_pending: push pending transactions to hardware
 * @descriptor_reuse: a submitted transfer can be resubmitted after completion
 * @device_release: called sometime atfer dma_async_device_unregister() is
 *     called and there are no further references to this structure. This
 *     must be implemented to free resources however many existing drivers
 *     do not and are therefore not safe to unbind while in use.
 * @dbg_summary_show: optional routine to show contents in debugfs; default code
 *     will be used when this is omitted, but custom code can show extra,
 *     controller specific information.
 */
struct dma_device {
	struct kref ref;
	unsigned int chancnt;
	unsigned int privatecnt;
	struct list_head channels;
	struct list_head global_node;
	struct dma_filter filter;
	dma_cap_mask_t  cap_mask;
	enum dma_desc_metadata_mode desc_metadata_modes;
	unsigned short max_xor;
	unsigned short max_pq;
	enum dmaengine_alignment copy_align;
	enum dmaengine_alignment xor_align;
	enum dmaengine_alignment pq_align;
	enum dmaengine_alignment fill_align;
	#define DMA_HAS_PQ_CONTINUE (1 << 15)

	int dev_id;
	struct device *dev;
	struct module *owner;
	struct ida chan_ida;
	struct mutex chan_mutex;	/* to protect chan_ida */

	u32 src_addr_widths;
	u32 dst_addr_widths;
	u32 directions;
	u32 min_burst;
	u32 max_burst;
	u32 max_sg_burst;
	bool descriptor_reuse;
	enum dma_residue_granularity residue_granularity;

	int (*device_alloc_chan_resources)(struct dma_chan *chan);
	int (*device_router_config)(struct dma_chan *chan);
	void (*device_free_chan_resources)(struct dma_chan *chan);

	struct dma_async_tx_descriptor *(*device_prep_dma_memcpy)(
		struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
		size_t len, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_xor)(
		struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src,
		unsigned int src_cnt, size_t len, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_xor_val)(
		struct dma_chan *chan, dma_addr_t *src,	unsigned int src_cnt,
		size_t len, enum sum_check_flags *result, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_pq)(
		struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
		unsigned int src_cnt, const unsigned char *scf,
		size_t len, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_pq_val)(
		struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
		unsigned int src_cnt, const unsigned char *scf, size_t len,
		enum sum_check_flags *pqres, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_memset)(
		struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
		unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_memset_sg)(
		struct dma_chan *chan, struct scatterlist *sg,
		unsigned int nents, int value, unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
		struct dma_chan *chan, unsigned long flags);

	struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
		struct dma_chan *chan, struct scatterlist *sgl,
		unsigned int sg_len, enum dma_transfer_direction direction,
		unsigned long flags, void *context);
	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
		size_t period_len, enum dma_transfer_direction direction,
		unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
		struct dma_chan *chan, struct dma_interleaved_template *xt,
		unsigned long flags);
	struct dma_async_tx_descriptor *(*device_prep_dma_imm_data)(
		struct dma_chan *chan, dma_addr_t dst, u64 data,
		unsigned long flags);

	void (*device_caps)(struct dma_chan *chan,
			    struct dma_slave_caps *caps);
	int (*device_config)(struct dma_chan *chan,
			     struct dma_slave_config *config);
	int (*device_pause)(struct dma_chan *chan);
	int (*device_resume)(struct dma_chan *chan);
	int (*device_terminate_all)(struct dma_chan *chan);
	void (*device_synchronize)(struct dma_chan *chan);

	enum dma_status (*device_tx_status)(struct dma_chan *chan,
					    dma_cookie_t cookie,
					    struct dma_tx_state *txstate);
	void (*device_issue_pending)(struct dma_chan *chan);
	void (*device_release)(struct dma_device *dev);
	/* debugfs support */
	void (*dbg_summary_show)(struct seq_file *s, struct dma_device *dev);
	struct dentry *dbg_dev_root;
};

比如device_prep_dma_memcpy()函数: 准备memcpy操作。其中有些函数都会返回指向struct dma_async_tx_descriptor结构的指针，struct dma_async_tx_descriptor表示我们想操作的事务（执行这些函数）

/**
 * struct dma_async_tx_descriptor - async transaction descriptor
 * ---dma generic offload fields---
 * @cookie: tracking cookie for this transaction, set to -EBUSY if
 *	this tx is sitting on a dependency list
 * @flags: flags to augment operation preparation, control completion, and
 *	communicate status
 * @phys: physical address of the descriptor
 * @chan: target channel for this operation
 * @tx_submit: accept the descriptor, assign ordered cookie and mark the
 * descriptor pending. To be pushed on .issue_pending() call
 * @callback: routine to call after this operation is complete
 * @callback_param: general parameter to pass to the callback routine
 * @desc_metadata_mode: core managed metadata mode to protect mixed use of
 *	DESC_METADATA_CLIENT or DESC_METADATA_ENGINE. Otherwise
 *	DESC_METADATA_NONE
 * @metadata_ops: DMA driver provided metadata mode ops, need to be set by the
 *	DMA driver if metadata mode is supported with the descriptor
 * ---async_tx api specific fields---
 * @next: at completion submit this descriptor
 * @parent: pointer to the next level up in the dependency chain
 * @lock: protect the parent and next pointers
 */
struct dma_async_tx_descriptor {
	dma_cookie_t cookie;
	enum dma_ctrl_flags flags; /* not a 'long' to pack with cookie */
	dma_addr_t phys;
	struct dma_chan *chan;
	dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
	int (*desc_free)(struct dma_async_tx_descriptor *tx);
	dma_async_tx_callback callback;
	dma_async_tx_callback_result callback_result;
	void *callback_param;
	struct dmaengine_unmap_data *unmap;
	enum dma_desc_metadata_mode desc_metadata_mode;
	struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
	struct dma_async_tx_descriptor *next;
	struct dma_async_tx_descriptor *parent;
	spinlock_t lock;
#endif
};

callback域是操作完成后调用的例程，赋值是一个函数名，

typedef void (*dma_async_tx_callback)(void *dma_async_param);

比如执行完device_prep_dma_memcpy()就会调用callback

提交事务

要将事务放入驱动程序的等待列表中，需要调用dmaengine_submit()。一旦准备好描述符并添加回调信息，就应该将其放在DMA引擎驱动程序等待队列中：

dma_cookie_t dmaengine_submit(struct dma_async_tx_descriptor *desc);

该函数会返回一个cooke，可以通过其他DMA引擎检查DMA活动的进度。dmaengine_submit() 不会启动DMA操作，它只是将其添加到待处理队列中。

发布待处理DMA请求并等待回调通知

启动事务是DMA传输设置的最后一步，在通道上调用dma_async_issue_pending()来激活通道待处理队列中的事务。如果通道空闲，则队列中的第一个事务将启动，后续事务排队等候。DMA操作完成时，队列中的下一个事务启动，并触发软中断(tasklet)。如果已经设置，则该tasklet负责调用客户端驱动程序的完成回调例程进行通知

void dma_async_issue_pending(struct dma_chan*chan);

4 程序

单缓冲区映射

/*
 * Copyright 2006-2014 Freescale Semiconductor, Inc. All rights reserved.
 */

/*
 * The code contained herein is licensed under the GNU General Public
 * License. You may obtain a copy of the GNU General Public License
 * Version 2 or later at the following locations:
 *
 * http://www.opensource.org/licenses/gpl-license.html
 * http://www.gnu.org/copyleft/gpl.html
 */

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/version.h>
#include <linux/platform_data/dma-imx.h>
#include <linux/dmaengine.h>
#include <linux/device.h>
#include <linux/io.h>
#include <linux/delay.h>

static int gMajor; /* major number of device */
static struct class *dma_tm_class;
u32 *wbuf;
u32 *rbuf;

struct dma_chan *dma_m2m_chan;
struct completion dma_m2m_ok;

/*
 * For single mapping, the buffer size does not need to be a multiple of page
 * size.
 *
 */
#define SDMA_BUF_SIZE  1024

static bool dma_m2m_filter(struct dma_chan *chan, void *param)
{
    if (!imx_dma_is_general_purpose(chan))
        return false;
    chan->private = param;
    return true;
}

int sdma_open(struct inode * inode, struct file * filp)
{
    dma_cap_mask_t dma_m2m_mask;
    struct imx_dma_data m2m_dma_data = {0};

    init_completion(&dma_m2m_ok);

    /* Initialize capabilities */
    dma_cap_zero(dma_m2m_mask);
    dma_cap_set(DMA_MEMCPY, dma_m2m_mask);
    m2m_dma_data.peripheral_type = IMX_DMATYPE_MEMORY;
    m2m_dma_data.priority = DMA_PRIO_HIGH;

    /* 1. 从一个DMA控制器分配DMA从通道 */
    dma_m2m_chan = dma_request_channel(dma_m2m_mask, dma_m2m_filter, &m2m_dma_data);
    if (!dma_m2m_chan) {
        pr_err("Error opening the SDMA memory to memory channel\n");
        return -EINVAL;
    } else {
		pr_info("opened channel %d, req lin %d\n", dma_m2m_chan->chan_id, m2m_dma_data.dma_request);
	}

    /* 分配缓存 */
    wbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!wbuf) {
        pr_err("error wbuf !!!!!!!!!!!\n");
        return -1;
    }

    rbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!rbuf) {
        pr_err("error rbuf !!!!!!!!!!!\n");
        return -1;
    }

    return 0;
}

int sdma_release(struct inode * inode, struct file * filp)
{
    dma_release_channel(dma_m2m_chan);
    dma_m2m_chan = NULL;
    kfree(wbuf);
    kfree(rbuf);
    return 0;
}

ssize_t sdma_read (struct file *filp, char __user * buf, size_t count,
                                loff_t * offset)
{
    int i;
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,0,35))
    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf+i) != *(wbuf+i)) {
            pr_err("Single DMA buffer copy falled!,r=%x,w=%x,%d\n", *(rbuf+i), *(wbuf+i), i);
            return 0;
        }
    }
    pr_info("buffer copy passed!\n");
#endif
    return 0;
}

static void dma_m2m_callback(void *data)
{
    pr_info("in %s\n",__func__);
    complete(&dma_m2m_ok);
    return ;
}

ssize_t sdma_write(struct file * filp, const char __user * buf, size_t count,
                                loff_t * offset)
{
    u32 *index, i;
    struct dma_slave_config dma_m2m_config = {0};
    struct dma_async_tx_descriptor *dma_m2m_desc;
    dma_addr_t dma_src, dma_dst;
    dma_cookie_t cookie;

    index = wbuf;
    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        *(index + i) = 0x56565656;
    }

    /* 2- Set slave and controller specific parameters */
    dma_m2m_config.direction = DMA_MEM_TO_MEM;  //指定DMA方向
    dma_m2m_config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES; //指定总线宽度
    dmaengine_slave_config(dma_m2m_chan, &dma_m2m_config);

    /* dma_src = dma_map_single(NULL, wbuf, SDMA_BUF_SIZE, dma_m2m_config.direction); */
    /* dma_dst = dma_map_single(NULL, rbuf, SDMA_BUF_SIZE, dma_m2m_config.direction); */
    /* DMA映射：分配DMA缓冲区，为其生成总线地址 */
    dma_src = dma_map_single(NULL, wbuf, SDMA_BUF_SIZE, DMA_TO_DEVICE);
    dma_dst = dma_map_single(NULL, rbuf, SDMA_BUF_SIZE, DMA_FROM_DEVICE);

    /* 3- Get a descriptor for the transaction. */
    dma_m2m_desc = dma_m2m_chan->device->device_prep_dma_memcpy(dma_m2m_chan, dma_dst, dma_src, SDMA_BUF_SIZE,0);
    dma_m2m_desc->callback = dma_m2m_callback;
    if (dma_m2m_desc)
		pr_info("Got a DMA descriptor\n");
	else
		pr_err("error in prep_dma_sg\n");


    /* 4- Submit the transaction */
    cookie = dmaengine_submit(dma_m2m_desc);
    pr_info("Got this cookie: %d\n", cookie);
 
    /* 5- Issue pending DMA requests and wait for callback notification */
    dma_async_issue_pending(dma_m2m_chan);
    pr_info("waiting for DMA transaction...\n");

    /* One can use wait_for_completion_timeout() also */
    wait_for_completion(&dma_m2m_ok);

    /* dma_unmap_single(NULL, dma_src, SDMA_BUF_SIZE, dma_m2m_config.direction); */
    /* dma_unmap_single(NULL, dma_dst, SDMA_BUF_SIZE, dma_m2m_config.direction); */
    dma_unmap_single(NULL, dma_src, SDMA_BUF_SIZE, DMA_TO_DEVICE);
    dma_unmap_single(NULL, dma_dst, SDMA_BUF_SIZE, DMA_FROM_DEVICE);

    return count;
}

struct file_operations dma_fops = {
    open:       sdma_open,
    release:    sdma_release,
    read:       sdma_read,
    write:      sdma_write,
};

int __init sdma_init_module(void)
{
    struct device *temp_class;
    int error;

    /* register a character device */
    error = register_chrdev(0, "sdma_test", &dma_fops);
    if (error < 0) {
        pr_err("SDMA test driver can't get major number\n");
        return error;
    }
    gMajor = error;
    pr_info("SDMA test major number = %d\n",gMajor);

    dma_tm_class = class_create(THIS_MODULE, "sdma_test");
    if (IS_ERR(dma_tm_class)) {
        pr_err(KERN_ERR "Error creating sdma test module class.\n");
        unregister_chrdev(gMajor, "sdma_test");
        return PTR_ERR(dma_tm_class);
    }

    temp_class = device_create(dma_tm_class, NULL,
                   MKDEV(gMajor, 0), NULL, "sdma_test");

    if (IS_ERR(temp_class)) {
        pr_err(KERN_ERR "Error creating sdma test class device.\n");
        class_destroy(dma_tm_class);
        unregister_chrdev(gMajor, "sdma_test");
        return -1;
    }

    pr_info("SDMA test Driver Module loaded\n");
    return 0;
}

static void sdma_cleanup_module(void)
{
    unregister_chrdev(gMajor, "sdma_test");
    device_destroy(dma_tm_class, MKDEV(gMajor, 0));
    class_destroy(dma_tm_class);

    pr_info("SDMA test Driver Module Unloaded\n");
}

module_init(sdma_init_module);
module_exit(sdma_cleanup_module);

MODULE_AUTHOR("Freescale Semiconductor");
MODULE_AUTHOR("John Madieu <john.madieu@gmail.com>");
MODULE_DESCRIPTION("SDMA test driver");
MODULE_LICENSE("GPL");

应用层执行open函数是会分配一个从DMA通道，并且分配两个缓冲区wbuf，rbuf，wbuf是我们要传输的数据，将数据传输到rbuf
应用层执行write会依次执行下面的步骤：

调用dmaengine_slave_config设置从设备和控制器指定参数
获取事务描述符：dma_m2m_desc ，并指定回调函数，dma_m2m_desc 是通过调用通道(dma_m2m_chan)里面的device域中的函数返回的
调用dmaengine_submit提交事务
调用dma_async_issue_pending函数激活事务
调用wait_for_completion来等待事务完成（完成后会自动调用dma_m2m_desc 的回调函数callback ），会执行complete函数，程序会继续执行下去。

应用层调用read函数会判断传输的数据对不对。

分散聚集映射

/*
 * Copyright 2006-2014 Freescale Semiconductor, Inc. All rights reserved.
 */

/*
 * The code contained herein is licensed under the GNU General Public
 * License. You may obtain a copy of the GNU General Public License
 * Version 2 or later at the following locations:
 *
 * http://www.opensource.org/licenses/gpl-license.html
 * http://www.gnu.org/copyleft/gpl.html
 */

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/version.h>
#include <linux/delay.h>
#include <linux/platform_data/dma-imx.h>
#include <asm/mach/dma.h>
#include <linux/dmaengine.h>
#include <linux/device.h>

#include <linux/io.h>
#include <linux/delay.h>

static int gMajor; /* major number of device */
static struct class *dma_tm_class;
u32 *wbuf, *wbuf2, *wbuf3;
u32 *rbuf, *rbuf2, *rbuf3;

struct dma_chan *dma_m2m_chan;
struct completion dma_m2m_ok;
struct scatterlist sg[3], sg2[3];

/*
 * There is an errata here in the book.
 * This should be 1024*16 instead of 1024
 */
#define SDMA_BUF_SIZE  1024*16



static bool dma_m2m_filter(struct dma_chan *chan, void *param)
{
    if (!imx_dma_is_general_purpose(chan))
        return false;
    chan->private = param;
    return true;
}

int sdma_open(struct inode * inode, struct file * filp)
{
    dma_cap_mask_t dma_m2m_mask;
    struct imx_dma_data m2m_dma_data = {0};

    init_completion(&dma_m2m_ok);

    /* Initialize capabilities */
    dma_cap_zero(dma_m2m_mask);
    dma_cap_set(DMA_MEMCPY, dma_m2m_mask);
    m2m_dma_data.peripheral_type = IMX_DMATYPE_MEMORY;
    m2m_dma_data.priority = DMA_PRIO_HIGH;

    /* 1- Allocate a DMA slave channel. */
    dma_m2m_chan = dma_request_channel(dma_m2m_mask, dma_m2m_filter, &m2m_dma_data);
    if (!dma_m2m_chan) {
        pr_info("Error opening the SDMA memory to memory channel\n");
        return -EINVAL;
    } else {
		pr_info("opened channel %d, req lin %d\n", dma_m2m_chan->chan_id, m2m_dma_data.dma_request);
	}

    wbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!wbuf) {
        pr_info("error wbuf !!!!!!!!!!!\n");
        return -1;
    }

    wbuf2 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!wbuf2) {
        pr_info("error wbuf2 !!!!!!!!!!!\n");
        return -1;
    }

    wbuf3 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!wbuf3) {
        pr_info("error wbuf3 !!!!!!!!!!!\n");
        return -1;
    }

    rbuf = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!rbuf) {
        pr_info("error rbuf !!!!!!!!!!!\n");
        return -1;
    }

    rbuf2 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!rbuf2) {
        pr_info("error rbuf2 !!!!!!!!!!!\n");
        return -1;
    }

    rbuf3 = kzalloc(SDMA_BUF_SIZE, GFP_DMA);
    if(!rbuf3) {
        pr_info("error rbuf3 !!!!!!!!!!!\n");
        return -1;
    }

    return 0;
}

int sdma_release(struct inode * inode, struct file * filp)
{
    dma_release_channel(dma_m2m_chan);
    dma_m2m_chan = NULL;
    kfree(wbuf);
    kfree(wbuf2);
    kfree(wbuf3);
    kfree(rbuf);
    kfree(rbuf2);
    kfree(rbuf3);
    return 0;
}

ssize_t sdma_read (struct file *filp, char __user * buf, size_t count,
                                loff_t * offset)
{
    int i;

    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf+i) != *(wbuf+i)) {
            pr_info("buffer 1 copy falled!\n");
            return 0;
        }
    }
    pr_info("buffer 1 copy passed!\n");

    for (i=0; i<SDMA_BUF_SIZE/2/4; i++) {
        if (*(rbuf2+i) != *(wbuf2+i)) {
            pr_err("buffer 2 copy falled!\n");
            return 0;
        }
    }
    pr_info("buffer 2 copy passed!\n");

    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        if (*(rbuf3+i) != *(wbuf3+i)) {
            pr_info("buffer 3 copy falled!\n");
            return 0;
        }
    }
    pr_info("buffer 3 copy passed!\n");

    return 0;
}

static void dma_m2m_callback(void *data)
{
    pr_info("in %s\n",__func__);
    complete(&dma_m2m_ok);
    return ;
}

ssize_t sdma_write(struct file * filp, const char __user * buf, size_t count,
                                loff_t * offset)
{
    u32 *index1, *index2, *index3, i, ret;
    struct dma_slave_config dma_m2m_config = {0};
    struct dma_async_tx_descriptor *dma_m2m_desc;
    dma_cookie_t cookie;
    struct timeval end_time;
	unsigned long end, start;

    index1 = wbuf;
    index2 = wbuf2;
    index3 = wbuf3;

    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        *(index1 + i) = 0x12121212;
    }

    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        *(index2 + i) = 0x34343434;
    }

    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
        *(index3 + i) = 0x56565656;
    }

#if 0
    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
    pr_info("input data_%d : %x\n", i, *(wbuf+i));
    }
    for (i=0; i<SDMA_BUF_SIZE/2/4; i++) {
    pr_info("input data2_%d : %x\n", i, *(wbuf2+i));
    }
    for (i=0; i<SDMA_BUF_SIZE/4; i++) {
    pr_info("input data3_%d : %x\n", i, *(wbuf3+i));
    }
#endif

    /* 2- Set slave and controller specific parameters */
    dma_m2m_config.direction = DMA_MEM_TO_MEM;
    dma_m2m_config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
    dmaengine_slave_config(dma_m2m_chan, &dma_m2m_config);

    sg_init_table(sg, 3);
    sg_set_buf(&sg[0], wbuf, SDMA_BUF_SIZE);
    sg_set_buf(&sg[1], wbuf2, SDMA_BUF_SIZE);
    sg_set_buf(&sg[2], wbuf3, SDMA_BUF_SIZE);
    ret = dma_map_sg(NULL, sg, 3, dma_m2m_config.direction);

    sg_init_table(sg2, 3);
    sg_set_buf(&sg2[0], rbuf, SDMA_BUF_SIZE);
    sg_set_buf(&sg2[1], rbuf2, SDMA_BUF_SIZE);
    sg_set_buf(&sg2[2], rbuf3, SDMA_BUF_SIZE);
    ret = dma_map_sg(NULL, sg2, 3, dma_m2m_config.direction);

    /* 3- Get a descriptor for the transaction. */
    dma_m2m_desc = dma_m2m_chan->device->device_prep_dma_sg(dma_m2m_chan, sg2, 3, sg, 3, DMA_MEM_TO_MEM);
    dma_m2m_desc->callback = dma_m2m_callback;
    if (dma_m2m_desc)
		pr_info("Got a DMA descriptor\n");
	else
		pr_info("error in prep_dma_sg\n");

    do_gettimeofday(&end_time);
	start = end_time.tv_sec*1000000 + end_time.tv_usec;

    /* 4- Submit the transaction */
    cookie = dmaengine_submit(dma_m2m_desc);
    pr_info("Got this cookie: %d\n", cookie);

    /* 5- Issue pending DMA requests and wait for callback notification */
    dma_async_issue_pending(dma_m2m_chan);
    pr_info("waiting for DMA transaction...\n");


    /* One can use wait_for_completion_timeout() also */
    wait_for_completion(&dma_m2m_ok);
    do_gettimeofday(&end_time);
	end = end_time.tv_sec*1000000 + end_time.tv_usec;
	pr_info("end - start = %d\n", end - start);

    /* Once the transaction is done, we need to  */
    dma_unmap_sg(NULL, sg, 3, dma_m2m_config.direction);
    dma_unmap_sg(NULL, sg2, 3, dma_m2m_config.direction);

    return count;
}

struct file_operations dma_fops = {
    open:       sdma_open,
    release:    sdma_release,
    read:       sdma_read,
    write:      sdma_write,
};

int __init sdma_init_module(void)
{
    struct device *temp_class;

    int error;

    /* register a character device */
    error = register_chrdev(0, "sdma_test", &dma_fops);
    if (error < 0) {
        pr_info("SDMA test driver can't get major number\n");
        return error;
    }
    gMajor = error;
    pr_info("SDMA test major number = %d\n",gMajor);

    dma_tm_class = class_create(THIS_MODULE, "sdma_test");
    if (IS_ERR(dma_tm_class)) {
        pr_info(KERN_ERR "Error creating sdma test module class.\n");
        unregister_chrdev(gMajor, "sdma_test");
        return PTR_ERR(dma_tm_class);
    }

    temp_class = device_create(dma_tm_class, NULL,
                   MKDEV(gMajor, 0), NULL, "sdma_test");

    if (IS_ERR(temp_class)) {
        pr_info(KERN_ERR "Error creating sdma test class device.\n");
        class_destroy(dma_tm_class);
        unregister_chrdev(gMajor, "sdma_test");
        return -1;
    }

    pr_info("SDMA test Driver Module loaded\n");
    return 0;
}

static void sdma_cleanup_module(void)
{
    unregister_chrdev(gMajor, "sdma_test");
    device_destroy(dma_tm_class, MKDEV(gMajor, 0));
    class_destroy(dma_tm_class);

    pr_info("SDMA test Driver Module Unloaded\n");
}

module_init(sdma_init_module);
module_exit(sdma_cleanup_module);

MODULE_AUTHOR("Freescale Semiconductor");
MODULE_AUTHOR("John Madieu <john.madieu@gmail.com>");
MODULE_DESCRIPTION("SDMA test driver");
MODULE_LICENSE("GPL");