Zynq-linux: Exchanging Data Between PL and PS via DMA

Date: 2021-11-08 13:48:16

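I. Goal

On the MYIR Z-turn board, use AXI DMA to exchange data between the Zynq PS and PL.

II. Analysis

① Sending data from PS to PL

The test program that accompanies the driver supplies a batch of data. DMA transfers it into the AXI4-Stream Data FIFO, and the PL side reads the data out of the Data FIFO.

② Sending data from PL to PS

The data that the PS sent to the PL is sent back and printed on the PS side. The data is then multiplied by 2 and fed into the DMA again.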

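③ Approach for the PL-side code

1) Reading data

With the Data FIFO in place, the PL reads its data from the Data FIFO. Bringing out the M_AXIS port of the Data FIFO gives the following signals: m_axis_tvalid, m_axis_tready, m_axis_tdata[31:0], m_axis_tkeep[3:0] and m_axis_tlast.

As the signal list shows, once the Data FIFO has received data, reading it back out depends entirely on these five signals, with the FIFO acting as master and the PL acting as slave. The FIFO therefore sets m_axis_tvalid to 1 on its own to indicate that data can be read from the master. The PL only has to answer by setting m_axis_tready to 1, and a word is then read on each rising clock edge.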
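The handshake rule described above can be sketched as a small cycle-level model in C. This is only an illustration, not the FIFO's implementation: the struct `axis_beat`, the function `axis_simulate` and the `ready_pattern` input are hypothetical names introduced here. The one rule it encodes is that a beat moves only in a cycle where tvalid and tready are both 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One AXI4-Stream beat as seen on a 32-bit M_AXIS port (illustrative type). */
struct axis_beat {
    uint32_t tdata;
    uint8_t  tkeep;   /* 4 bits, one per byte lane */
    uint8_t  tlast;   /* 1 on the final beat of a packet */
};

/*
 * Cycle-level model of the valid/ready handshake.
 * ready_pattern[c] is the slave's tready in cycle c (hypothetical input).
 * A beat is transferred only in cycles where tvalid and tready are both 1;
 * otherwise the master keeps presenting the same beat.
 * Returns the number of beats copied into out[].
 */
static size_t axis_simulate(const struct axis_beat *src, size_t n_src,
                            const uint8_t *ready_pattern, size_t n_cycles,
                            struct axis_beat *out)
{
    size_t sent = 0;

    assert(src != NULL && ready_pattern != NULL && out != NULL);
    for (size_t c = 0; c < n_cycles; c++) {
        int tvalid = (sent < n_src);         /* master still has data to offer */
        int tready = ready_pattern[c] != 0;  /* slave accepts in this cycle */

        if (tvalid && tready) {
            out[sent] = src[sent];           /* transfer happens on this edge */
            sent++;
        }
    }
    return sent;
}
```

Feeding it three beats and the ready pattern 0,1,0,1,1,1 moves the beats in cycles 1, 3 and 4. Nothing is lost or duplicated in the cycles where tready is 0.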

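In addition, when AXI DMA has finished transferring the PS data, its completion flag mm2s_introut goes to 1. This flag decides whether to start reading from the FIFO, so the signal is used as the trigger for the PL interrupt.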

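2) Writing data

The data read by the PL is to be multiplied by 2 and sent back to the PS, which prints it. (In the code listed in section III, the PL loops the data back unchanged and the test program on the PS does the doubling before the next transfer.) The PL side connects to the S_AXIS_S2MM port of AXI DMA, whose signals are s_axis_s2mm_tvalid, s_axis_s2mm_tready, s_axis_s2mm_tdata, s_axis_s2mm_tkeep and s_axis_s2mm_tlast.

As the signal list shows, the DMA port is now the slave and the PL is the master that sends data to it. To send data, the PL asserts s_axis_s2mm_tvalid to tell the slave that data is available, and waits for the slave to respond on s_axis_s2mm_tready. Once the slave has responded, data can be sent. While sending, s_axis_s2mm_tkeep must be held high, and s_axis_s2mm_tlast must be set to 1 on the last word of the transfer.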
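As a rough illustration of the tkeep and tlast rules just described, the sketch below builds the sequence of beats a master would present for one packet. The type `s2mm_beat` and the function `s2mm_build_packet` are made-up names for this example. It assumes 32-bit words in which all four byte lanes are valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One beat presented to the S2MM slave port (illustrative type). */
struct s2mm_beat {
    uint32_t tdata;
    uint8_t  tkeep;   /* 0xF: all four byte lanes carry valid data */
    uint8_t  tlast;   /* 1 only on the final beat of the packet */
};

/*
 * Turn n data words into n beats of a single packet:
 * tkeep is held at 0xF for every beat and tlast is raised once, on the
 * last word. Returns the number of beats written.
 */
static size_t s2mm_build_packet(const uint32_t *words, size_t n,
                                struct s2mm_beat *beats)
{
    assert(words != NULL && beats != NULL);
    for (size_t i = 0; i < n; i++) {
        beats[i].tdata = words[i];
        beats[i].tkeep = 0xF;
        beats[i].tlast = (i + 1 == n) ? 1 : 0;
    }
    return n;
}
```

Keeping the "exactly one tlast, on the final word" rule in one place like this makes it easy to check a testbench or a hardware state machine against it.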

3) Overall architecture

[Figure: overall block design, with a zoomed-in detail view]

III. Code implementation

①pl_read.v

module pl_read(clk,rst,m_axis_tvalid,m_axis_tdata,m_axis_tkeep,m_axis_tlast,m_axis_tready,m_ready,m_data,m_datavalid,m_datalast);
input clk;
input rst;
input m_axis_tvalid;
input [31:0]m_axis_tdata;
input [3:0]m_axis_tkeep;
input m_axis_tlast;
output m_axis_tready;
input m_ready;
output [31:0]m_data;
output m_datavalid;
output m_datalast;
reg m_axis_tready;
reg [31:0]m_data;
reg m_datavalid;
reg m_datalast;

always@(posedge clk or negedge rst)
begin
  if(!rst)
  begin
    m_axis_tready <= 0;
    m_data <= 0;
    m_datavalid <= 0;
    m_datalast <= 0;
  end
  else
  begin
    if(m_ready==1)
    begin
      m_axis_tready <= 1;
      if(m_axis_tvalid==1&&m_axis_tkeep==4'b1111)
      begin
        m_data <= m_axis_tdata;
        m_datavalid <= 1;
        if(m_axis_tlast==1) m_datalast <= 1;
        else m_datalast <= 0;
      end
      else
      begin
        m_data <= m_data;
        m_datavalid <= 0;
        m_datalast <= 0;
      end
    end
    else
    begin
      m_axis_tready <= 0;
      m_data <= m_data;
      m_datavalid <= 0;
      m_datalast <= 0;
    end
  end
end
endmodule

②pl_write.v

module pl_write(clk,rst,s_data,s_datavalid,s_datalast,s_ready,s_axis_tdata,s_axis_tkeep,s_axis_tlast,s_axis_tready,s_axis_tvalid);
input clk;
input rst;
input[31:0] s_data;   //data stream
input s_datalast;     //data last flag
input s_datavalid;    //input data valid flag
output s_ready;       //flag: data may be written
input s_axis_tready;
output[31:0] s_axis_tdata;
output[3:0]s_axis_tkeep;
output s_axis_tlast;
output s_axis_tvalid;
reg [31:0] s_axis_tdata;
reg [3:0]s_axis_tkeep;
reg s_axis_tlast;
reg s_axis_tvalid;
reg s_ready;

always@(posedge clk or negedge rst)
begin
  if(!rst)
  begin
    s_axis_tkeep <= 4'b0000;
    s_axis_tvalid <= 0;
    s_axis_tdata <= 0;
    s_axis_tlast <= 0;
    s_ready <= 0;
  end
  else
  begin
    if(s_axis_tready==1)
    begin
      if(s_datavalid==1)
      begin
        s_axis_tvalid <= 1;
        s_axis_tkeep <= 4'b1111;
        s_axis_tdata <= s_data;
        if(s_datalast) s_axis_tlast <= 1;
        else s_axis_tlast <= 0;
      end
      else
      begin
        s_axis_tvalid <= 0;
        s_axis_tkeep <= 4'b0000;
        s_axis_tdata <= s_axis_tdata;
        s_axis_tlast <= 0;
      end
      s_ready <= 1;
    end
    else
    begin
      s_axis_tkeep <= 4'b0000;
      s_axis_tvalid <= 0;
      s_axis_tdata <= s_axis_tdata;
      s_axis_tlast <= 0;
      s_ready <= 0;
    end
  end
end
endmodule

③top.v

module top();
wire clk;
wire rst;
wire m_axis_tvalid;
wire [31:0]m_axis_tdata;
wire [3:0]m_axis_tkeep;
wire m_axis_tlast;
wire m_axis_tready;
wire m_ready;
wire [31:0]m_data;
wire m_datavalid;
wire m_datalast;
wire s_axis_tready;
wire[31:0] s_axis_tdata;
wire[3:0]s_axis_tkeep;
wire s_axis_tlast;
wire s_axis_tvalid;

pl_write u1(
  .clk(clk),
  .rst(rst),
  .s_data(m_data),
  .s_datalast(m_datalast),
  .s_datavalid(m_datavalid),
  .s_ready(m_ready),
  .s_axis_tready(s_axis_tready),
  .s_axis_tdata(s_axis_tdata),
  .s_axis_tkeep(s_axis_tkeep),
  .s_axis_tlast(s_axis_tlast),
  .s_axis_tvalid(s_axis_tvalid)
);

pl_read u2(
  .clk(clk),
  .rst(rst),
  .m_data(m_data),
  .m_datalast(m_datalast),
  .m_datavalid(m_datavalid),
  .m_ready(m_ready),
  .m_axis_tready(m_axis_tready),
  .m_axis_tdata(m_axis_tdata),
  .m_axis_tkeep(m_axis_tkeep),
  .m_axis_tlast(m_axis_tlast),
  .m_axis_tvalid(m_axis_tvalid)
);

system_wrapper u3(
  .FCLK_CLK0(clk),
  .peripheral_aresetn(rst),
  .s_axis_aclk(clk),
  .s_axis_aclk_1(clk),
  .s_axis_aresetn(rst),
  .s_axis_aresetn_1(rst),
  .s_axis_tready(s_axis_tready),
  .s_axis_tdata(s_axis_tdata),
  .s_axis_tkeep(s_axis_tkeep),
  .s_axis_tlast(s_axis_tlast),
  .s_axis_tvalid(s_axis_tvalid),
  .m_axis_tready(m_axis_tready),
  .m_axis_tdata(m_axis_tdata),
  .m_axis_tkeep(m_axis_tkeep),
  .m_axis_tlast(m_axis_tlast),
  .m_axis_tvalid(m_axis_tvalid)
);
endmodule

④ DMA driver code

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <asm/io.h>
#include <linux/init.h>
#include <linux/platform_device.h>
#include <linux/miscdevice.h>
#include <linux/ioport.h>
#include <linux/of.h>
#include <linux/uaccess.h>
#include <linux/interrupt.h>
#include <asm/irq.h>
#include <linux/irq.h>
#include <linux/dma-mapping.h>

/* DMA driver */

/* DMA base addresses */
#define DMA_S2MM_ADDR 0X40400000
#define DMA_MM2S_ADDR 0X40410000

/* Interrupt numbers passed to request_irq()/free_irq() */
#define DMA_MM2S_IRQ 61
#define DMA_S2MM_IRQ 62

/* DMA MM2S control register */
static void __iomem *mm2s_cr;
#define MM2S_DMACR 0X00000000
/* DMA MM2S status register */
static void __iomem *mm2s_sr;
#define MM2S_DMASR 0X00000004
/* DMA MM2S source address, low 32 bits */
static void __iomem *mm2s_sa;
#define MM2S_SA 0X00000018
/* DMA MM2S transfer length (bytes) */
static void __iomem *mm2s_len;
#define MM2S_LENGTH 0X00000028
/* DMA S2MM control register */
static void __iomem *s2mm_cr;
#define S2MM_DMACR 0X00000030
/* DMA S2MM status register */
static void __iomem *s2mm_sr;
#define S2MM_DMASR 0X00000034
/* DMA S2MM destination address, low 32 bits */
static void __iomem *s2mm_da;
#define S2MM_DA 0X00000048
/* DMA S2MM transfer length (bytes) */
static void __iomem *s2mm_len;
#define S2MM_LENGTH 0X00000058

#define DMA_LENGTH 524288

static dma_addr_t axidma_handle;   /* bus address of the DMA buffer */
static void *axidma_addr;          /* CPU address of the DMA buffer */

static irqreturn_t dma_mm2s_irq(int irq, void *dev_id)
{
    printk("\nPs write data to fifo is over! irq=%d\n", irq);
    iowrite32(0x00001000, mm2s_sr);   /* clear the completion interrupt flag */
    return IRQ_HANDLED;
}

/* Fires once the data in the FIFO has been read out */
static irqreturn_t dma_s2mm_irq(int irq, void *dev_id)
{
    iowrite32(0x00001000, s2mm_sr);   /* clear the completion interrupt flag */
    printk("\nps read data from fifo is over! irq=%d\n", irq);
    return IRQ_HANDLED;
}

static int major;
static struct class *dma_class;

static int dma_init(void);
static void dma_exit(void);
static int dma_open(struct inode *inode, struct file *file);
static ssize_t dma_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos);
static ssize_t dma_read(struct file *file, char __user *buf, size_t size, loff_t *ppos);

/* file_operations: the interface between user space and this driver */
static struct file_operations dma_lops = {
    .owner = THIS_MODULE,
    .read  = dma_read,
    .open  = dma_open,
    .write = dma_write,
};

/* Initialisation, used by module_init */
static int dma_init(void)
{
    major = register_chrdev(0, "dma_dev", &dma_lops);
    dma_class = class_create(THIS_MODULE, "dma_dev");
    device_create(dma_class, NULL, MKDEV(major, 0), NULL, "dma_dev");
    printk("major dev number= %d\n", major);

    mm2s_cr  = ioremap(DMA_MM2S_ADDR + MM2S_DMACR, 4);
    mm2s_sr  = ioremap(DMA_MM2S_ADDR + MM2S_DMASR, 4);
    mm2s_sa  = ioremap(DMA_MM2S_ADDR + MM2S_SA, 4);
    mm2s_len = ioremap(DMA_MM2S_ADDR + MM2S_LENGTH, 4);
    s2mm_cr  = ioremap(DMA_S2MM_ADDR + S2MM_DMACR, 4);
    s2mm_sr  = ioremap(DMA_S2MM_ADDR + S2MM_DMASR, 4);
    s2mm_da  = ioremap(DMA_S2MM_ADDR + S2MM_DA, 4);
    s2mm_len = ioremap(DMA_S2MM_ADDR + S2MM_LENGTH, 4);
    return 0;
}

/* Cleanup, used by module_exit (must return void) */
static void dma_exit(void)
{
    device_destroy(dma_class, MKDEV(major, 0));
    class_destroy(dma_class);
    unregister_chrdev(major, "dma_dev");

    if (axidma_addr) {   /* only set after a successful open() */
        free_irq(DMA_MM2S_IRQ, NULL);   /* free_irq() takes the IRQ number, not the handler */
        free_irq(DMA_S2MM_IRQ, NULL);
        dma_free_coherent(NULL, DMA_LENGTH, axidma_addr, axidma_handle);
    }

    iounmap(mm2s_cr);
    iounmap(mm2s_sr);
    iounmap(mm2s_sa);
    iounmap(mm2s_len);
    iounmap(s2mm_cr);
    iounmap(s2mm_sr);
    iounmap(s2mm_da);
    iounmap(s2mm_len);
}

/* open handler */
static int dma_open(struct inode *inode, struct file *file)
{
    int err;

    printk("DMA open\n");
    if (axidma_addr)   /* buffer and IRQs were set up by an earlier open() */
        return 0;

    /* NULL device as in the original; recent kernels generally expect a valid struct device here */
    axidma_addr = dma_alloc_coherent(NULL, DMA_LENGTH, &axidma_handle, GFP_KERNEL);
    if (!axidma_addr)
        return -ENOMEM;

    err = request_irq(DMA_MM2S_IRQ, dma_mm2s_irq, IRQF_TRIGGER_RISING, "dma_dev", NULL);
    printk("err=%d\n", err);
    if (err)
        goto err_free_buf;
    err = request_irq(DMA_S2MM_IRQ, dma_s2mm_irq, IRQF_TRIGGER_RISING, "dma_dev", NULL);
    printk("err=%d\n", err);
    if (err)
        goto err_free_mm2s;
    return 0;

err_free_mm2s:
    free_irq(DMA_MM2S_IRQ, NULL);
err_free_buf:
    dma_free_coherent(NULL, DMA_LENGTH, axidma_addr, axidma_handle);
    axidma_addr = NULL;
    return err;
}

/* write handler; returns 0 on success, as in the original */
static ssize_t dma_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
{
    unsigned int mm2s_status = 0;

    printk("dma write start !\n");
    if (count > DMA_LENGTH) {
        printk("the number of data is too large!\n");
        return 0;
    }
    /* buf is a user-space pointer, so it must not be passed to memcpy() */
    if (copy_from_user(axidma_addr, buf, count))
        return -EFAULT;

    iowrite32(0x00001001, mm2s_cr);
    iowrite32(axidma_handle, mm2s_sa);
    iowrite32(count, mm2s_len);

    /* poll bit 1 of the status register until it is set */
    mm2s_status = ioread32(mm2s_sr);
    while ((mm2s_status & (1 << 1)) == 0)
        mm2s_status = ioread32(mm2s_sr);

    printk("mm2s_status =0x%x\n", mm2s_status);
    printk("dma write is over!\n");
    return 0;
}

/*
 * read handler; returns 0 on success, which the test program relies on.
 * The DMA moves data in 32-bit words.
 */
static ssize_t dma_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
{
    unsigned int s2mm_status = 0;

    printk("dma read start!\n");
    if (size > DMA_LENGTH) {
        printk("the number of data is too large!\n");
        return 1;
    }

    iowrite32(0x00001001, s2mm_cr);
    iowrite32(axidma_handle, s2mm_da);
    iowrite32(size, s2mm_len);

    /* poll bit 1 of the status register until it is set */
    s2mm_status = ioread32(s2mm_sr);
    while ((s2mm_status & (1 << 1)) == 0)
        s2mm_status = ioread32(s2mm_sr);

    printk("s2mm_sr=0x%x\n", s2mm_status);
    if (copy_to_user(buf, axidma_addr, size))
        return -EFAULT;
    printk("\ndma read is over!\n");
    return 0;
}

module_init(dma_init);
module_exit(dma_exit);

MODULE_AUTHOR("TEST@dma");
MODULE_DESCRIPTION("dma driver");
MODULE_ALIAS("dma linux driver");
MODULE_LICENSE("GPL");
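The driver writes the literals 0x00001001 and 0x00001000 to the DMA registers and polls bit 1 of the status register. The sketch below gives those bits names and adds an error check of the kind the driver does not perform. The bit positions follow the AXI DMA register map for direct register mode as I understand it. The macro and function names are introduced here for illustration, so check them against the product guide before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Control register (DMACR) bits, assumed from the AXI DMA register map */
#define DMACR_RS         (1u << 0)    /* run/stop */
#define DMACR_IOC_IRQEN  (1u << 12)   /* enable interrupt on complete */

/* Status register (DMASR) bits, assumed from the AXI DMA register map */
#define DMASR_HALTED     (1u << 0)
#define DMASR_IDLE       (1u << 1)
#define DMASR_DMAINTERR  (1u << 4)    /* DMA internal error */
#define DMASR_DMASLVERR  (1u << 5)    /* DMA slave error */
#define DMASR_DMADECERR  (1u << 6)    /* DMA decode error */
#define DMASR_IOC_IRQ    (1u << 12)   /* write 1 to clear */
#define DMASR_ERR_IRQ    (1u << 14)

#define DMASR_ANY_ERROR  (DMASR_DMAINTERR | DMASR_DMASLVERR | DMASR_DMADECERR)

/* The named bits reproduce the two literals used in the driver. */
static_assert((DMACR_RS | DMACR_IOC_IRQEN) == 0x00001001u, "control word");
static_assert(DMASR_IOC_IRQ == 0x00001000u, "interrupt-clear word");

/* Nonzero when the channel reports idle (the condition the driver polls). */
static int dmasr_is_idle(uint32_t sr)
{
    return (sr & DMASR_IDLE) != 0;
}

/* Nonzero when any of the three error flags is set. */
static int dmasr_has_error(uint32_t sr)
{
    return (sr & DMASR_ANY_ERROR) != 0;
}
```

With these definitions, a status value of 0x4011 decodes as Halted, DMAIntErr and Err_Irq. Checking `dmasr_has_error()` inside the polling loops would let the driver give up instead of spinning when the channel halts on an error.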

⑤ Test program

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Crude busy-wait delay; volatile keeps the compiler from removing the loops */
void delay(void)
{
    volatile int i, j;
    for (i = 0; i < 20000; i++)
        for (j = 0; j < 10000; j++);
}

unsigned int readarray[10001];

int main(int argc, char **argv)
{
    int fd;
    int i = 0;

    fd = open("/dev/dma_dev", O_RDWR);
    if (fd < 0) {
        printf("can not open file\n");
        return 1;
    }
    printf("open file success\n");
    delay();

    /* initial test pattern: 1, 2, ..., 4000 */
    for (i = 0; i < 4000; i++)
        readarray[i] = i + 1;

    while (1) {
        write(fd, readarray, 4000 * 4);
        /* the driver's read handler returns 0 on success */
        if (read(fd, readarray, 4000 * 4) == 0) {
            for (i = 0; i < 4000; i++) {
                printf(" %u", readarray[i]);
                readarray[i] = readarray[i] * 2;   /* double before the next round */
            }
            printf("\n=====================================\n");
            printf("======================================\n");
        }
        delay();
        delay();
    }
    return 0;
}
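Because the test program doubles every element on each pass through its loop, the value to expect at a given index and round follows from simple arithmetic. The sketch below works it out, assuming the PL returns the data unchanged. The helper names `expected_value` and `first_wrapping_round` are made up for this illustration. It also shows when the unsigned values start to wrap modulo 2^32, which is worth knowing before concluding that late rounds contain corrupted data.

```c
#include <assert.h>
#include <stdint.h>

/* The test program stores its data in unsigned int, assumed to be 32 bits. */
static_assert(sizeof(unsigned int) == sizeof(uint32_t), "32-bit unsigned int");

/*
 * Value expected at 0-based index i in the round-th printout (round 0 is
 * the first), assuming a pure loopback in the PL. The multiplication wraps
 * modulo 2^32 exactly as readarray[i] * 2 does in the test program.
 */
static uint32_t expected_value(uint32_t i, unsigned round)
{
    uint32_t v = i + 1;

    while (round--)
        v *= 2u;
    return v;
}

/* First round in which the value at index i no longer fits in 32 bits. */
static unsigned first_wrapping_round(uint32_t i)
{
    uint64_t v = (uint64_t)i + 1;
    unsigned round = 0;

    while (v <= UINT32_MAX) {
        v *= 2;
        round++;
    }
    return round;
}
```

For the last element (index 3999, initial value 4000) the first wrapped value appears in round 21. From that round on, the printed numbers no longer equal the initial pattern times a power of two.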

⑥ Makefile

KDIR = /home/python/Hard_disk_21G/04-Linux_Source/Kernel/linux-xlnx
PWD := $(shell pwd)
CC = $(CROSS_COMPILE)gcc
ARCH = arm
MAKE = make
obj-m := dma_driver.o

modules:
	$(MAKE) -C $(KDIR) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) M=$(PWD) modules

clean:
	make -C $(KDIR) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) M=$(PWD) clean

⑦ Results

The program output matches the expected values.

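IV. Debugging

I ran into a lot of pitfalls while debugging the data exchange.

The first big pitfall was in the Data FIFO. I had set the data width to 8 bits. The PL wrote test data into the DMA and the PS read it back, but the values had become very large, and only the first part of the array was non-zero while the rest was all 0. It turned out that the PS reads from the DMA in bytes, while the data was being stored in an integer array on the PS side. As a result, four PL data items were merged into one PS item: the PL sent 200 items, but on the PS side only the first 50 had values and the rest were 0. After changing the PL data width to 32 bits, the PS read the data correctly.

The second big pitfall was ignoring the tlast signal. This one took a long time to track down, and I only found it through slow, step-by-step testing. The PL was sending 100,000 data items, and reading all 100,000 on the PS in one go worked fine. I then tried reading fewer. Reading 1,000, 2,000 or 10,000 items all worked, and the data was correct. The problem appeared on the second loop iteration: the data from the second read was identical to that of the first. Printing the DMA status register gave 0x4011, which indicates a DMA internal error. The cause turned out to be that no tlast signal arrived during the transfer, so the DMA reported an internal error. In the next experiment I asserted tlast once every 10,000 items and read 20,000 items. Only the first 10,000 items could be read, and everything after them was 0. The second round behaved the same way. I finally understood that a single DMA transfer must contain exactly one tlast, and that data after the tlast cannot be read in that transfer.

The third pitfall came up in the block design, which reported that the clocks of the DMA's S_AXIS_S2MM and the FIFO's M_AXIS did not match. I assumed I had set something up wrong, since other people's designs had no such problem. In the end I deleted all the IP blocks, connected the FIFO to the DMA first, and only then ran the automatic connection, which solved the problem.

The fourth pitfall was in instantiating and wiring the PL side. To simulate the FIFO, I first wrote a module to go with it and wired the FIFO's S_AXIS port inside that module, then wired the FIFO's M_AXIS port in the simulation file. I wanted to look at the waveforms in functional simulation, but whatever I tried they were all xxxx. I revised the AXI timing again and again and it still failed. It turned out there were two FIFOs: one wired up inside the module and another in the simulation file, which is why the waveforms were xxxx. The fix was to add a top.v that wires all the modules together. The clock is no longer instantiated separately inside a lower-level module either; all wiring and instantiation is done in top.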
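The tlast behaviour described in the second pitfall can be captured in a toy model. This is a sketch of the observations reported above, not of the DMA's internals. The function `s2mm_toy_transfer`, its parameters and the result enum are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum s2mm_result { S2MM_OK = 0, S2MM_INTERNAL_ERROR = 1 };

/*
 * Toy model of one S2MM transfer.
 *   requested    : number of 32-bit words the PS asks for
 *   tlast_period : the PL asserts tlast on every tlast_period-th word
 *                  (0 means the PL never asserts tlast)
 * dst[] is zeroed first; word i of the stream carries the value i + 1.
 * *received is set to the number of words actually stored.
 */
static enum s2mm_result s2mm_toy_transfer(uint32_t *dst, size_t requested,
                                          size_t tlast_period, size_t *received)
{
    assert(dst != NULL && received != NULL);
    memset(dst, 0, requested * sizeof dst[0]);
    *received = 0;

    for (size_t i = 0; i < requested; i++) {
        dst[i] = (uint32_t)(i + 1);
        (*received)++;
        if (tlast_period != 0 && (i + 1) % tlast_period == 0)
            return S2MM_OK;   /* tlast ends the transfer; later words stay 0 */
    }
    /* The requested length was filled without ever seeing tlast. */
    return S2MM_INTERNAL_ERROR;
}
```

Requesting 20 words with tlast every 10 words stores only the first 10 and leaves the rest at 0. Requesting 20 words from a stream that never asserts tlast ends in the error case, mirroring the two experiments described above at a smaller scale.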

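V. Summary

① On every DMA transfer, read the DMA status register and use its value to decide whether a DMA error occurred. In the tests described here, a DMA internal error meant that no tlast signal had been received during the transfer.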

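② The PS reads from and writes to the DMA in bytes. Pay attention to the element type of the array that holds the data: whether it is an integer array or a character array, it has to match the data width on the PL side.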
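The width mismatch in point ② comes down to how bytes in memory are seen through a 32-bit element type. The sketch below packs a byte stream into 32-bit words the way a little-endian CPU would read it. Little-endian order is an assumption here, and `pack_bytes_le` is a name made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack a byte stream into 32-bit words in little-endian order, i.e. the
 * first byte received becomes the least significant byte of the word.
 * Returns the number of whole words produced (n_bytes / 4).
 */
static size_t pack_bytes_le(const uint8_t *bytes, size_t n_bytes,
                            uint32_t *words)
{
    size_t n_words = n_bytes / 4;

    assert(bytes != NULL && words != NULL);
    for (size_t i = 0; i < n_words; i++) {
        words[i] = (uint32_t)bytes[4 * i]
                 | ((uint32_t)bytes[4 * i + 1] << 8)
                 | ((uint32_t)bytes[4 * i + 2] << 16)
                 | ((uint32_t)bytes[4 * i + 3] << 24);
    }
    return n_words;
}
```

With 200 one-byte items holding 1 to 200, this produces 50 words, the first being 0x04030201. That matches the symptom of unexpectedly large values in only the first quarter of the integer array.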
