Saturday, June 29, 2024

Breaking the memory wall and power consumption wall, the present and future of domestic chip AI-NPU

With the rollout of 5G, the cost-effectiveness of the Internet of Things has become apparent, and trends such as industrial digitization and urban intelligence are increasingly clear: they improve productivity and production efficiency, reduce costs, and accelerate the construction of new smart cities. Notably, digital twin technology has been written into China's 14th Five-Year Plan, providing national strategic guidance for the construction of digital twin cities.

Author: Liu Jianwei, co-founder of Aixin Yuanzhi


Regarding digital twins, consider an example. The unmanned retail stores launched by Amazon and JD.com in recent years effectively turn an offline store into an online one. A shopper first registers a face-scan login in the store's app; once facial authentication succeeds, the account is linked automatically and a face scan opens the door. After shopping there is no need to queue for manual checkout: the shopper simply scans their face and leaves. The store appears unattended, but behind the scenes artificial intelligence tracks the entire process, and every move a consumer makes is captured by cameras. For example, if you pick up a product and examine it twice but, for whatever reason, put it back and buy something else, that signals strong interest tempered by some reservation. Such data is captured and analyzed in depth to build a basic database, which can then be combined with your shopping records and consumption habits to drive personalized recommendations.

This example shows the convenience of digitizing the physical world. Vision is one of the most important means by which humans perceive the world. Digitization is the foundation of an intelligent society, perception is the prerequisite for digitizing the physical world, and the type, quantity, and quality of front-end visual perception determine how intelligent our society can become. The foundation of future intelligence is therefore "perception + computing", and AI vision will play a critical role in this process, with very broad application prospects. Some industry analysts believe digital twin technology is about to expand beyond manufacturing into fields that integrate the Internet of Things, artificial intelligence, and data analytics. That is why we chose this entrepreneurial direction.

As the most important entrance from the physical world to the digital twin world, vision chips are receiving widespread attention, especially AI visual perception chips capable of reconstructing 80%-90% of the physical world.

So what is an AI visual perception chip? From the demand side, it needs two capabilities: one is to see clearly, and the other is to understand. The AI-ISP is responsible for seeing clearly, and the AI-NPU is responsible for understanding.

Figure | Technical characteristics of AI vision chips

In a broad sense, any chip that accelerates AI workloads can be called an AI chip, and the module dedicated to accelerating neural-network computation is commonly called an NPU (neural processing unit). AI vision chips accelerated by NPUs are already widely used in big data, intelligent driving, and image processing.

According to data released by IDC, the accelerated-server market reached US$5.39 billion in 2021, up 68.6% year-on-year. GPU servers dominated with roughly a 90% market share, while non-GPU accelerated servers (ASIC, FPGA, and the like) took 11.6% of the market, growing 43.8% to US$630 million. This means the neural processing unit has moved past the early pilot stage and is becoming a key requirement in the AI business. So today we will talk about the AI-NPU, which underpins both "seeing clearly" and "understanding".

Why does seeing clearly have anything to do with the AI-NPU? Intuitively, "seeing clearly" is easy to understand. For example, we want to see things clearly at night, but pictures taken by traditional cameras tend to be overexposed, color details are drowned out, and noise appears around moving people and distant buildings. How, then, can we better "see clearly"? In fact, it is the large computing power of the AI-NPU that gives the vision chip this ability.

Figure | Night video effect comparison chart

Take smart cities as an example, where 5-megapixel cameras are already used for intelligent analysis. Traditional video-quality improvement relies on a conventional ISP, which leaves heavy noise in low-light scenes. An AI-ISP solves this problem and still delivers clear pictures in the dark. However, AI-ISP technology requires the AI algorithm to process the video at full resolution and full frame rate; opportunistic shortcuts such as downscaling or frame skipping are unacceptable because the human eye is very sensitive to flicker in image quality. Processing a 5-megapixel video stream at full resolution and full frame rate places very high demands on NPU computing power.

The same holds in intelligent-analysis scenarios such as vehicle detection and license-plate recognition, where a 5-megapixel camera typically records at 30 fps and detection runs only every third or fifth frame. Downscaling the frames to 720p makes distant license plates in the scene unreadable and may miss fast-moving vehicles. The solution is to process at full resolution and a higher detection frame rate, which again places very high demands on NPU computing power.
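A rough back-of-envelope calculation illustrates the gap between the two approaches. The operations-per-pixel figure below is a hypothetical assumption (real detectors vary widely), but the ratio between full-resolution, full-frame-rate processing and the downscaled, frame-skipping shortcut is what matters:

```python
# Back-of-envelope compute comparison: full-resolution detection vs. the
# downscaled, frame-skipping shortcut. ops_per_pixel is a made-up assumption.

def detection_gops(width, height, fps, ops_per_pixel=1000, frame_stride=1):
    """Rough GOPS needed to run a detector at the given resolution/frame rate."""
    pixels = width * height
    ops_per_second = pixels * ops_per_pixel * (fps / frame_stride)
    return ops_per_second / 1e9

# 5 MP (2592x1944) at the full 30 fps vs. 720p with detection every 5th frame
full = detection_gops(2592, 1944, 30)
reduced = detection_gops(1280, 720, 30, frame_stride=5)
print(f"full-res: {full:.0f} GOPS")      # ~151 GOPS
print(f"reduced:  {reduced:.1f} GOPS")   # ~5.5 GOPS
print(f"ratio:    {full / reduced:.0f}x")
```

Under these assumptions the honest, full-resolution path needs roughly 27x the sustained compute of the shortcut, which is exactly the demand being placed on the NPU.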

In addition to seeing clearly, as mentioned above, we also need to understand. Understanding means intelligent analysis, and intelligent analysis likewise needs the large computing power of the AI-NPU. Consider this from two angles.

First, AI is fundamentally a tool for improving efficiency, and it must ultimately land in real scenarios; this is the idea behind the early "AI+" and the more recent "+AI". What can AI do when it enters an industry? A great deal. For example, it can replace some industry expert systems with neural networks, effectively installing such an "expert" inside the AI chip. For that expert system to be smart enough, it needs a smarter, larger network; a larger network is like a larger brain, holding and storing more weight values, which places high demands on NPU computing power.

Second, from the deployment perspective, models are mostly trained on servers with large computing power but deployed on end-side devices where computing power is limited. Only when the computation of a model or algorithm is reduced to what the end side can run can it land on the application side, so model compression is required, and compression demands considerable skill from engineers. If end-side computing power is relatively high, this process can be shortened. It is similar to the history of embedded software development: early on, constrained by compute, we had to squeeze out every bit of hardware performance and wrote programs in assembly; with more computing power available, we can develop in C instead. In other words, trading some computing power for development efficiency and faster AI deployment is a reasonable exchange, but it in turn raises the requirements on NPU computing power.

Above, we analyzed what drives AI visual perception chip companies to develop high-performance, high-compute NPUs. Developing chips with large computing power, however, is very difficult.

As is well known, computing power is a key indicator of NPU performance. However, the computing power of many early AI chips was only a nominal value that could not be achieved in actual use: a chip advertised at 1 TOPS might deliver only 200 or 300-400 GOPS in practice. The industry therefore now prefers the more practical metrics FPS/W and FPS/$, which measure how efficiently advanced algorithms actually run on a computing platform.
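A small sketch makes the point concrete. The two hypothetical chips below have the same nominal 1 TOPS rating but different real MAC utilization; the per-frame workload and power figures are illustrative assumptions, not measurements of any actual product:

```python
# Why FPS/W is more honest than nominal TOPS: two hypothetical chips with the
# same 1 TOPS label but different real utilization. All numbers are illustrative.

def effective_fps(nominal_tops, utilization, gops_per_frame):
    """Frames per second actually achieved given real MAC utilization."""
    effective_gops = nominal_tops * 1000 * utilization
    return effective_gops / gops_per_frame

# A "1T" chip that really delivers 200 GOPS vs. one that delivers 800 GOPS
chip_a = effective_fps(nominal_tops=1.0, utilization=0.20, gops_per_frame=5.0)
chip_b = effective_fps(nominal_tops=1.0, utilization=0.80, gops_per_frame=5.0)

print(f"chip A: {chip_a:.0f} FPS -> {chip_a / 3.0:.1f} FPS/W at 3 W")
print(f"chip B: {chip_b:.0f} FPS -> {chip_b / 3.0:.1f} FPS/W at 3 W")
```

On paper the two chips are identical; measured in FPS/W, one is four times the other, which is the gap the nominal spec sheet hides.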

Figure | Design difficulties and driving forces of AI-NPU

In autonomous driving, when Tesla released its FSD chip, Musk compared it with the Nvidia Drive PX2 previously used by Tesla: in raw computing power the FSD is about 3 times the Drive PX2, yet it achieves 21 times the FPS on self-driving tasks.

In AI vision chips, the AX630A, the first high-performance, low-power AI vision processor released by Aixin Yuanzhi, processes 3116 and 1356 frames per second respectively when running different neural networks on public datasets, far exceeding comparable chips, while consuming only about 3 W.

Figure | AX630A product block diagram

What actually opens such gaps in NPU utilization? Behind them lie the memory wall and the power consumption wall. The memory wall means that when we raise the nominal computing power by stacking MAC units, data bandwidth must keep up; otherwise the data supply falls short, the MAC units sit waiting for data, and processing performance degrades. The power wall comes mainly from two sources: the MAC units and the DDR. Stacking MAC units raises the total power of the MAC array itself while simultaneously demanding higher bandwidth. On the server side, relatively expensive HBM can be used, but the associated memory power inevitably rises; on the end side, cost constraints leave no particularly good DDR option.
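The memory wall can be sketched with simple arithmetic: how much bandwidth must the memory system supply so a MAC array never starves? The numbers below (array size, operand width, reuse factors) are illustrative assumptions, not any specific chip's figures:

```python
# Memory-wall sketch: DDR bandwidth needed to keep a MAC array busy.
# All numbers are illustrative assumptions, not a real chip's figures.

def required_bandwidth_gbs(tops, bytes_per_op, reuse_factor):
    """GB/s the memory system must supply so the MACs never starve.

    reuse_factor: how many operations each fetched byte participates in;
    on-chip buffering and tiling raise it, streaming everything from DDR
    keeps it low.
    """
    ops_per_second = tops * 1e12
    return ops_per_second * bytes_per_op / reuse_factor / 1e9

# A 4 TOPS array fetching 1-byte operands:
poor = required_bandwidth_gbs(4, 1, reuse_factor=10)    # little on-chip reuse
good = required_bandwidth_gbs(4, 1, reuse_factor=100)   # aggressive tiling
print(f"poor reuse: {poor:.0f} GB/s")   # 400 GB/s - far beyond end-side DDR
print(f"good reuse: {good:.0f} GB/s")   # 40 GB/s  - plausible for LPDDR
```

With poor data reuse, even a modest 4 TOPS array would demand bandwidth only server-class HBM can deliver; increasing reuse (or shrinking the data, as mixed precision does below) is what keeps end-side chips feasible.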

To address the memory wall and the power consumption wall, the two common obstacles to deploying AI, the industry generally takes one of two approaches. One is compute-in-memory, which is still constrained by process-node bottlenecks and remains some distance from mass production. The other is to reduce data movement. Aixin Yuanzhi reduces data movement through mixed-precision technology, thereby lowering the memory-wall and power-wall barriers to a certain extent and improving the efficiency of the whole NPU.

So how does mixed precision reduce data movement? First, clarify the concept: mixed precision means performing numerical computation with floating-point and fixed-point numbers of different precisions within the same network.

Figure | Schematic diagram of neural network (simplified version)

As shown in the figure above, each column is called a layer: the leftmost is the input layer, the rightmost is the output layer, and everything in between is a hidden layer. Each circle represents a neuron, and each neuron has many links. The number on a link is its weight, an important operand in the computation; if a weight is a decimal, it is represented as a floating-point number.
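The figure's structure maps directly onto a few lines of code. The sketch below, with arbitrary illustrative weights and inputs, shows one hidden layer computing a weighted sum per neuron, which is exactly where the weight precision discussed next comes into play:

```python
import numpy as np

# Minimal sketch of the figure: a 3-neuron input layer feeding a 2-neuron
# hidden layer. Weights and inputs are arbitrary illustrative values.
x = np.array([0.5, -1.2, 0.8])           # input layer activations
W = np.array([[0.1, -0.4, 0.7],          # hidden-layer weights (2 x 3):
              [0.9,  0.2, -0.3]])        # one row of link weights per neuron
h = np.maximum(0.0, W @ x)               # weighted sum, then ReLU activation
print(h)
```

Every multiply in `W @ x` touches one weight, so the numeric format of `W` directly sets both the compute cost and the bytes moved per layer.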

Across an entire neural network the weight coefficients are quite complex. To preserve the accuracy of AI algorithms, traditional NPUs generally represent data as 8-bit, 16-bit, or floating-point numbers, which makes the computation heavy. In practical applications, however, Aixin Yuanzhi found that some information in an AI network is redundant: not all calculations need high-precision floating-point or 16-bit arithmetic, and low-precision operations such as 8-bit or 4-bit often suffice.

Aixin Yuanzhi's AI-ISP is built on this mixed-precision technology, with many intermediate layers of the network running at INT4 precision. Compared with the original 8-bit network, data movement may drop to about 1/2 and computation to about 1/4. This improves NPU utilization and efficiency, delivering several times the equivalent computing power of a traditional NPU per unit area.
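The data-movement saving can be illustrated with the simplest quantization scheme. The sketch below uses symmetric max-abs linear quantization, which is a generic textbook approach and not necessarily the scheme Aixin Yuanzhi uses; the weight tensor is random illustrative data:

```python
import numpy as np

# Sketch of symmetric linear quantization and the data-movement saving the
# text describes. Max-abs scaling is a generic scheme chosen for simplicity,
# not a description of any specific NPU's quantizer.

def quantize(w, bits):
    """Map float weights to signed integers of the given bit width."""
    qmax = 2 ** (bits - 1) - 1                     # 7 for INT4, 127 for INT8
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

w = np.random.default_rng(0).normal(size=1_000_000).astype(np.float32)
q4, s4 = quantize(w, bits=4)

bytes_fp32 = w.size * 4
bytes_int8 = w.size * 1
bytes_int4 = w.size // 2            # two INT4 values packed per byte
print(bytes_int8 / bytes_int4)      # INT4 moves half the data of INT8
print(bytes_fp32 / bytes_int4)      # and an eighth of FP32
```

Halving the bytes per weight halves the DDR traffic for those layers, which is precisely how lower precision pushes back against the memory and power walls.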

Of course, deploying AI requires more than solving the memory wall and the power wall; the algorithm and the hardware must also be designed together. On the end side and edge side in particular, a chip is inherently weakly coupled to its scenarios, so Aixin Yuanzhi adopts joint optimization from application to algorithm to NPU when designing its AI vision chips.

Figure | Co-design of algorithm and NPU

Specifically, in traditional AI solutions the algorithm and the hardware are usually two independent parts. At Aixin Yuanzhi, however, the algorithm team provides the NPU architects, early in the design, with details such as the network structure, quantization and operator requirements, and memory-access patterns, so the NPU design can be adjusted and optimized to what the algorithms need and the algorithms can run at their best. In turn, the hardware engineers provide the algorithm engineers with hardware constraints such as data-flow optimization, storage optimization, and quantization limits, so algorithms can be designed around the hardware's weak points. Combining the two balances the NPU's hardware and software development and accelerates the landing of AI.

Building on these advantages and this accumulation, Aixin Yuanzhi has launched two generations, four chips in total, of end-side and edge-side AI visual perception chips: AX630A, AX620A, AX620U, and AX170A. The AX170A optimizes real-time 4K 30fps image quality for mobile-phone scenarios; paired with a main control chip, it enables super night-scene video and excellent low-light shooting, rendering fine high-definition detail under low nighttime illumination. The AX620A targets smart-city, smart-home, and similar applications; it delivers excellent image quality in low light while consuming only about 1 W, meeting the power requirements of battery-powered designs and covering IoT devices, smart action cameras, mobile phones, and other scenarios. The AX630A targets dense scenarios such as smart cities and smart transportation; with powerful low-light image and video processing and 20-channel 1080p 30fps decoding, it maximizes the combined advantages of high image quality, full intelligence, full sensing, and real-time analysis, easily meeting customers' core demands of "all-weather" operation and "seeing clearly".

Figure | AX170A

Aixin Yuanzhi understands that a vertical ecosystem is where AI chips ultimately prove their value, so alongside its chips it provides demo boards, development kits, and open-source software packages to lower users' development difficulty and shorten their development cycles.

From the user's point of view, adopting third-party NPU chips not only lowers in-house R&D difficulty and delivers sufficient effective AI computing power, but also reduces development costs.

As Dr. Xiaoshen Qiu, founder and CEO of Aixin Yuanzhi, said at the 2021 World Artificial Intelligence Conference: "We hope to do our bit to provide more digital and intelligent new infrastructure for the world, and to provide more support on the edge and end sides, bringing more profound changes to society."
