
Recognition Industry Reports - PDF Edition

  • Dingxiang (顶象): 2022 Face Recognition Security White Paper (26 pages).pdf

    Business security leader. Face Recognition Security White Paper. Copyright notice: the copyright of this white paper belongs to Beijing Dingxiang Technology Co., Ltd. This white paper.

    Views: 127  Published: 2023-06-08  26 pages  Rating: 5 stars
  • Mobile Ecosystem Forum (MEF) and Upstream: Mobile Identity - Seamless User Identification in the Post-Cookie Era (English edition) (32 pages).pdf

    Mobile Identity: Seamless user identification in a post-cookie world
    White Paper, 2022

    What's inside
      00 Overview and Highlights
      01 A new playing field in digital advertising
      02 Upstream's Mobile Identity Technology
      03 Mobile Identity in action
      04 Mobile operators have the most to gain
      05 Knowing your customer is easier than ever

    Overview
    With the impending demise of third-party cookies, the digital advertising industry is frantically looking for new ways to enable targeted ads that comply with privacy laws. In this context, advertising based on first-party data is becoming more prevalent. Upstream's Mobile Identity kills two birds with one stone: it provides an alternative online identification method to cookies while making the collection of first-party data, primarily mobile phone numbers, easier than ever. This patented Upstream technology relies on partnerships with Mobile Network Operators (MNOs), creating significant new revenue streams for them and changing their role from so-called "dumb pipes" to a vital part of the digital advertising ecosystem. With third-party cookies going away, Mobile Identity comes into play.

    Upstream's Mobile Identity: all you need to know
      • It identifies users on the open web to engage them through mobile messaging.
      • It uses mobile phone numbers as unique identifiers and is an alternative to third-party cookies (which are being phased out).
      • It makes the collection of first-party data easier for brands and mobile operators.
      • It allows seamless user identification and works across all browsers, websites and apps.
      • It is software-based, straightforward to implement, and scalable.
      • It is secure and complies with stringent privacy laws, such as GDPR, CCPA, LGPD, and POPIA.
      • It allows building data-driven, personalized campaigns.
      • It opens up mobile marketing, user authentication, and other industries to mobile operators.
      • It is easy for the user and does not require downloading apps or logging into web pages.

    Highlights
    Knowing your customer:
      • 82% of MNO website visitors do not log in.
      • Brands leveraging first-party data see a 1.5x increase in cost savings.
      • 90% of customers find business messaging annoying if it is not personally relevant.
    The opportunity (numbers from the market):
      • Mobile is a $341bn market: 75% of total global internet advertising (out of $455 billion in total in 2021).
      • 60% of global internet traffic comes from mobile phones.
      • 88% of web traffic won't be tracked by third-party cookies by H2 2024.
    The benefits of leveraging Mobile Identity:
      • 10% of revenue recovered by retargeting abandoned carts.
      • Up to 85% of guest web visitors identified.
      • 10x increase in opt-in conversions.
      • 20-40% increase in digital sales.

    01 A new playing field in digital advertising

    Know your customer
    One of the main questions every business must answer is: who is their target audience, and how does their service or product meet that audience's needs and desires? Marketers have always centered their efforts on the answers to these questions, working on ways to make the brand and its offerings relevant to the needs of customers. More recently, with the development of digital marketing, the importance of truly understanding the customer has reached a whole new level. New degrees of audience segmentation are now possible, which enable brands to reach the specific people who are most likely to engage with their offerings. Advertising has become much more interactive as a result, and can turn consumers directly into customers. Everything is now measurable, too, which means marketers can test what works best with each audience.

    All this is driven by user identification, which provides the ability to create the personalized marketing that customers have come to expect, along with high-quality interactivity and seamless user experiences. It also enhances online security, contributing to fraud prevention and making sure every interaction is genuine. In other words, user identification is the holy grail for marketers. But there's change ahead.

    [Figure: Who is the target audience? Attributes shown: location, MSISDN, spending, page visited, gender, channel, age.]

    Cookies: The end of an era is near
    Until now, user identification has been synonymous with third-party cookies. However, growing privacy concerns about how these cookies track users across the web are leading to their impending demise, reshaping the digital marketing ecosystem. Apple's Safari [1] and Mozilla's Firefox [2], which together hold 23% [3] of the total internet browser market, have already eliminated third-party cookies. Google had also announced plans to phase out cookies from its Chrome browser, which represents 65% of the market, in 2023 [4]. However, Google has since pushed its plans back to the second half of 2024 [5], allowing the tech giant more time to test its new Privacy Sandbox and giving the market a bit more time to get accustomed to the change. Despite the delays, there's no question that the elimination of third-party cookies by 88% of the browser market represents the end of an era.

    Unsurprisingly, the industry is looking for alternatives. The most prevalent solutions are those dependent on a different unique identifier, most often the user's email address. There are other, more probabilistic methods, too, such as contextual targeting. But these can be problematic for brands, as the results they generate are questionable.

    [1] "Apple updates Safari's anti-tracking tech with full third-party cookie blocking", The Verge, March 2020
    [2] "Today's Firefox Blocks Third-Party Tracking Cookies and Cryptomining by Default", Mozilla, September 2019
    [3] "Browser Market Share Worldwide", Statcounter, July 2022
    [4] "Google's next big Chrome update will rewrite the rules of the web", Wired, February 2021
    [5] "Expanding testing for the Privacy Sandbox for the Web", Google Blog, July 2022

    Cookie /ˈkʊki/ (noun, computing)
    A cookie is a small piece of data that is stored in a user's browser each time they
    visit a website. While first-party cookies are placed by the publisher of a website and can be used to improve UX by remembering user preferences and settings, third-party cookies are placed by someone other than the owner of a website (i.e., a third party) and allow user data to be shared with other parties. Third-party cookies are mostly used to track users across different websites and display relevant ads to them.

    The opportunity
    The eventual removal of third-party cookies creates space within the digital advertising ecosystem that is now up for grabs, and many companies are competing to get their foot in the door. This is by no means the end of targeting and personalized marketing. Rather, it elevates the role of first-party data. This isn't new: brands have long been asking for customers' personal details, one way or another, to send offers, news, and other commercial information via personalized communications.

    Among first-party data, the MSISDN (Mobile Station International Subscriber Directory Number), more often known as a user's unique mobile phone number, is the most crucial. While email has so far been the piece of information companies most often ask of potential clients, the mobile number is gaining traction, as mobile messaging has proven to be the most effective method of engaging with users. According to Gartner [6], SMS has a 98% open rate compared to just 20% for email, and a 45% response rate versus 6% for email. The rise of channels such as RCS and of OTT messaging platforms such as WhatsApp and Viber, which bring the functionalities of the internet into the mobile messaging world, also contributes to making each user's unique mobile number an important asset for brands.

    [Figure: Mobile messaging is more engaging. SMS: 98% open rate, 45% response rate. Email: 20% open rate, 6% response rate.]

    First-party data is key for marketers as it allows them to:
      #1 Build trust. 91% of consumers are concerned about the amount of data companies collect about them [7].
      #2 Reach the right audience every time. 94% of businesses say the quality of contact data has become more important recently [8].
      #3 Comply with privacy regulations. Securing customer consent before collecting any user data and being fully compliant with regulations such as GDPR.
      #4 Reduce costs. Gathering data directly rather than purchasing it from other companies cuts costs. Brands leveraging first-party data increase their cost savings by 1.5x [9].
      #5 Personalize content. 90% of customers find messages from companies annoying if they are not relevant [10], while 70% say a company's understanding of their personal needs affects their loyalty [11].
      #6 Have seamless UX. With first-party data such as a mobile phone number or email address serving as a Universal Unique Identifier (UUID), brands can simplify customer journeys, create a better user experience, and achieve more conversions.
      #7 Increase revenues. At the end of the day, every brand's goal is to increase revenues. Companies using first-party data in advanced marketing activations can achieve up to 3x higher revenue uplift [12].

    So, why isn't first-party data-based marketing already the top priority for marketers? The answer is simple: first-party data has traditionally been harder to acquire than third-party data. In fact, 62% of brands cite an inability to integrate the necessary technologies as the primary barrier to leveraging first-party data [13].

    [6] "The Future of Sales Follow-Ups: Text Messages", Gartner, October 2019
    [7] "The Cookieless World: A Guide for the New Era of Digital Marketing", Dentsu, July 2021
    [8] "Data experience: The data-driven strategy behind business growth", Experian, May 2021
    [9] "5 keys to creating value with first-party data", Think With Google, April 2021
    [10] "Attitude to personalization among internet users in the United States as of January 2019", Statista Research Department, February 2021
    [11] "How Your Customers' Expectations Have Changed in the Age of the Customer", Salesforce, July 2017
    [12] "Responsible Marketing With First-Party Data In Asia Pacific: A $200 Billion Value Unlock Opportunity", The Boston Consulting Group, May 2020
    [13] "Privacy personalization: How APAC brands can responsibly unlock the full value of first-party data", Think With Google, May 2020

    02 Upstream's Mobile Identity Technology

    How it works
    Upstream's Mobile Identity technology makes the collection of first-party data easy for marketers. It enables brands to transform anonymous mobile users on the open web into identified marketing prospects. It is estimated that the proportion of guest web visitors identified this way can be as high as 85%. Mobile Identity enables this by leveraging a unique asset mobile network operators have at their disposal, the MSISDN, which functions as a globally unique marker of each user's identity.

    Mobile identification raises the role of mobile network operators as valuable partners in the emerging post-cookie digital ecosystem. The technology works as a black box requiring limited input from operators themselves. The Mobile Identity solution is strictly software-based and is implemented by Upstream. It requires minimal integration (just a few days of a network engineer's time) and no upfront investment. All that is needed is a single configuration on the back end.

    With Upstream's Mobile Identity, mobile phone numbers become unique user identifiers. Identification takes place over the mobile network (mobile data traffic), which accounts for between 50% and 95% of total internet traffic, depending on the country.

    The solution is designed to work on any website and app. All that's required is a single, straightforward configuration on the back end. Once this is done, the website will be able to identify users seamlessly and continuously. The end user doesn't need to do anything except opt in, because their profile is linked to their MSISDN. The technology will work with new mobile
    implementations, 5G rollouts, and new web features, making it future proof.

    Upstream's Mobile Identity technology is innovative in the way it collects the user's mobile phone number. Until now, a brand that wanted to acquire this data had to ask the user to fill in their details via a form. Upstream's user identification technology, however, can instantly recognize the MSISDN of a user visiting a website and auto-fill their details, making the collection of first-party data easier and more user-friendly than ever. The only thing required is for the user to give their consent via a simple tick box.

    Our Mobile Identity feature is also integrated into Grow, Upstream's marketing automation platform. Brands that use it can act directly on the data they collect, running automated campaigns across a wide range of mobile channels.

      • MNOs apply a one-off network configuration.
      • Websites and apps apply the Upstream tech on themselves, opening up the functionality.
      • Upstream enables Mobile Identification via its patented technology.

    Why Mobile Identity is cool
    Upstream's Mobile Identity brings a plethora of benefits to brands that adopt the solution:

    Cross functionality
    As an advanced user identification solution, our Mobile Identity is designed to work on any browser, website, portal, and application. Because Upstream's Mobile Identity operates at the network level, it can work across all browsers. In contrast, cookies work at the browser level, which means that they only work on browsers that still enable third-party cookie tracking, and each time the user is on a different browser they must go through a new opt-in process.

    Mobile-centric
    This is a mobile-centric solution at a time when:
      • The majority of internet traffic (60.6%) comes from mobile phones [14].
      • Mobile marketing has emerged as the largest piece of the digital marketing pie. In 2021, mobile ad spend stood at $341 billion, 75% of the total digital advertising market [15].

    Deterministic identification
    Established using first-party data, Mobile Identity is a deterministic method, based on real data the customer shares during their actual interaction with a specific brand. This means there are no doubts regarding its accuracy, in contrast to probabilistic methods such as contextual marketing, where there is a significant margin for error.

    First-party data collection made easy
    The technology's ability to automatically fill in user details can increase opt-in rates tenfold [16]. With the user's consent, the brand can also access detailed customer profile information linked to their unique MSISDN. This creates new possibilities for targeting users and serving personalized content, improving customer engagement and conversions.

    Making data work immediately
    The inability to integrate technologies is not an issue with Mobile Identity, as it is part of Grow, Upstream's mobile marketing automation platform. The technology can directly leverage first-party data to manage different audiences, reach them through a wide variety of mobile messaging and digital channels, and automate communications according to the users' interactions.

    Seamless user experience
    Upstream's Mobile Identity offers all the convenience of an app without the hassle of having to download anything or log in, simplifying user journeys. Customers just have to opt in once, and then they can continuously enjoy the benefits of personalized, targeted content that is aligned to their preferences.

    How to remember users even when they don't log in on your website
    Major MNOs from highly populated countries have more than 10 million website visitors every month. 82% of these visitors don't log in [17], meaning sites can't currently identify them effectively. Data integration is no issue for Mobile Identity technology, as it is part of Upstream's mobile marketing automation platform.

    Personalization
    Companies can build sophisticated user profiles from MSISDN-linked data, using geography, habits, and history. This can be used to provide personalized messages to customers and avoid spamming them with irrelevant information.

    Privacy and security at the heart of our Mobile Identity technology
    Upstream's Mobile Identity complies with stringent privacy regulations around the world, such as GDPR, CCPA, LGPD, and POPIA. Users must opt in to be identified. The first-party data gathered is only used by the brand collecting it and is not shared with any third parties. Users always have the choice to opt out and stop receiving communications if they are no longer interested in a brand's offers.

    The patented technology is also inherently secure by design, with all information encrypted end-to-end. If hackers conducted a successful man-in-the-middle attack, any data they accessed would be useless, protecting users' information. No personally identifiable information (PII) can be communicated to third parties, and the identification takes place over HTTPS. All personal information is encrypted end-to-end.

    [14] "Desktop vs Mobile vs Tablet Market Share Worldwide", Statcounter, July 2022
    [15] "Worldwide Digital Ad Spending 2021", Insider Intelligence & eMarketer, April 2021
    [16] Upstream proprietary data
    [17] Upstream proprietary data

    03 Mobile Identity in action

    Use cases

    1. Security | Silent mobile verification
    Many mobile applications ask users to fill in their mobile phone number and then send them a message, usually via SMS, containing a one-time password (OTP) that the user must submit to verify their identity. Upstream's Mobile Identity simplifies this process and the whole user experience. Users only have to give their consent to be identified; the verification process then takes place automatically and immediately.

    [18] Upstream proprietary data

    Convert and sell with Mobile Identity
    Upstream's Mobile Identity solution can be
    applied to the needs of businesses across various industries and markets, from mobile operators to retail companies, app publishers, insurance companies, financial institutions, and more. With Mobile Identity, marketers can make highly customized offers using available data about the end user, allowing them to deliver a true omnichannel experience. The potential uplift in digital sales is estimated at between 20% and 40% [18].

    20-30% of users drop off the verification process when an OTP is required, either because of the effort involved or simply because the message was never delivered.

    Seamless verification in a taxi service app
      1. A user downloads an app, which needs to validate their information.
      2. They are asked to fill in their MSISDN via a pop-up.
      3. Upstream performs a silent mobile verification in the background to check the validity of the phone number.
      4. The authentication is completed, and the user is instantly able to use the app.
    [App mockup: welcome screen, phone number entry, "Please wait as we are checking the validity of your number".]

    2. User experience | Fast, seamless and personalized customer journeys
    The identification technology simplifies user journeys by removing lengthy log-in and authentication processes, improving user engagement and conversions. Users don't need to log in at all, because the website recognizes them if they have already given their consent once. This means that companies can instantly customize the content of their website based on the users visiting it. The content of the page will therefore be more relevant to the visitor, leading to better engagement and more conversions. Audience reporting is offered, too, providing marketers with insights based on aggregated data from users. Moreover, users don't have to go through any sign-up or sign-in process before they make their purchases. This is particularly important when you consider that 24% of users abandon their carts before checking out when they are asked to create an account [19].

    [19] "48 Cart Abandonment Rate Statistics", Baymard Institute, 2022

    Personalized retail content
    When a retail customer visits a website, they can be served relevant ads based on their profile. For example, a student might receive ads for stationery and computer supplies, while a parent will see ads for children's clothing.
    [Website mockups: personalize what they see, based on their profile (fashionistas, students, parents).]

    Mobile Number Portability (MNP) acquisitions
    A customer of one MNO visits another MNO's website. The website, using Upstream's solution, flags that the user is coming from another mobile network and shows them a pop-up inviting them to join the new network by offering a deal. When the user clicks the pop-up, their network migration process begins.
      1. Chris is a user from another MNO network browsing your web page.
      2. He gets a pop-up notification to switch network.
      3. He clicks on the pop-up and the migration process begins.

    Personalized upselling
    An existing high-value customer of an MNO browses to the operator's website. The website recognizes them immediately and upsells a new package, increasing the customer's lifetime value.
      1. A pre-paid user is browsing the MNO's web page.
      2. With Mobile Online Identification we identify the user.
      3. We target them with a personalized ad to upgrade their plan.
    [MNO website mockups: switch-network offer and personalized plan-upgrade offers, shown after MSISDN online identification.]

    Buy now, pay later offers
    A user searches for a new mobile phone on a mobile operator's website. The operator can instantly recognize that this is a high-value customer and offer them micro-credit for a premium smartphone purchase in installments. The offer is showcased at the top of the page, rather than as a payment method at the end of the purchase process. This way the user is prompted to make a higher-value purchase.
    [Operator website mockup: smartphone offer with a new data plan and 12 interest-free installments, shown after MSISDN online identification.]

    3. Marketing | Advanced omnichannel retargeting
    With this patent, mobile operators and other businesses can use the MSISDN as a mobile advertising ID. Because Mobile Identity is part of Grow, Upstream's mobile multi-channel marketing automation platform, they can recognize users across online and offline channels, and across different browsers and apps, allowing push retargeting. This enables marketers to re-engage mobile users who have fallen out of the funnel via another channel, avoiding messaging fatigue.
      • 90% increase in campaign efficiency with automated same-day retargeting [20]
      • 47% reduction in messaging spam [20]

    [20] Upstream proprietary data

    Increasing lead generation with multi-channel retargeting
    A potential new customer visits a car insurance website but leaves before filling in the form to get a quote. The insurance company sends them an automated, personalized RCS message prompting them to return and complete the form.
    Automated same-day retargeting through Upstream's Grow platform:
      1. User fills in a car insurance mobile form to get a quote.
      2. User drops off.
      3. 20' later, the user gets a retargeting message to return to the form.
      4. User returns and fills in the form.
    [Chat mockup: insurance quote form, followed by the reminder message "Did you forget something? ... Complete the form and we will get the right quote for you."]

    Decreasing cart abandonment with multichannel retargeting
    A customer visits an e-shop and adds a TV to their online shopping cart, but does not check out. They receive an automated SMS with a clickable link back to their shopping cart so they can return and complete the purchase. Cart abandonment retargeting can recover more than 10% of revenues that would otherwise be lost [21].
      1. User browses the e-shop and adds a TV to the cart.
      2. User drops off.
      3. 30' later, the user gets retargeted via an SMS.
      4. User re-enters the flow.
    [E-shop mockup: product page, cart with a $1400 Smart TV, and the reminder message "Hey, haven't you forgotten something? Complete your order now & enjoy free shipping".]

    [21] Upstream proprietary data

    Retargeting based on the stage of the funnel
    An MNO subscriber interacts with an ad, which suggests a mobile plan upgrade. The user identification technology detects the level of interaction and the stage of the funnel the user reached, retargeting them accordingly. For example, a user
    who has just bounced off the website will receive an SMS message suggesting an exclusive mobile data offer to make them reconsider the plan upgrade.
      Step 1, BOUNCED: entered the online flow and did nothing. Retargeted via SMS, same day.
      Step 2, INTERACTED: entered the 2nd step but never progressed. Retargeted via RCS, same day.
      Step 3, EMAIL: user added their email and then bounced. Retargeted via email, same day.
      Step 4, LAST STEP: user successfully completed the subscription.

    04 Mobile operators have the most to gain

    An opportunity MNOs can't afford to miss
    Mobile operators can use Upstream's Mobile Identity to serve their own purposes. This includes creating a better user experience for their subscribers, making personalized offers according to the individual user visiting their website, and retargeting customers leaving the funnel.

    What's even more interesting, though, is that operators can monetize the technology by offering it to third parties and becoming lead players in new industries. They can offer this value-added service to content partners, e-commerce brands, and other players to help them increase tailored advertising capabilities, create better experiences for their users, and provide new password-less solutions for authentication. The cherry on top is that no additional investment is needed to make this happen, as the solution provided by Upstream is software-based and needs only minimal integration.

    The technology represents a significant opportunity for operators not only to grow new revenues, but to avoid being sidelined in the evolving world of digital marketing. The digital marketing sector was a market worth $455 billion [22] in 2021, with mobile marketing representing 75% of this. The opportunity is there for the taking in a field where MNOs have not previously been able to compete as effectively.

    MNOs make significant investments in infrastructure to connect billions of users through their networks and deliver the best possible experience. However, messaging apps such as WhatsApp, Viber and Facebook Messenger reap billions in revenue each year by operating over the top. The change in the digital marketing landscape is a chance for operators to tip the scales back in their favor, once and for all.

    [22] "Worldwide Digital Ad Spending 2021", Insider Intelligence & eMarketer, April 2021

    05 Knowing your customer is easier than ever

    User identification has never been easier or more efficient
    Upstream's Mobile Identity is a new, transparent and user-friendly identification technology, replacing and improving upon older technologies such as one-time passwords and third-party cookies. It can be implemented by MNOs both for their own campaigns and as an offering to other businesses. With their permission, users can be served personalized, targeted content that's relevant to their interests and circumstances, boosting sales, conversions, and revenues.

    Upstream's solution identifies users on the open web and engages them through mobile messaging, as part of Upstream's mobile marketing automation platform, Grow. It also brings dramatic upgrades to user experience and security.

    For too long, MNOs have struggled to capitalize on the business opportunities afforded by mobile marketing. Taking advantage of web user identification via MSISDN technology represents a huge opportunity, and one that MNOs cannot afford to miss. Upstream's patent is the easiest way to turn online users into identified marketing prospects and paying customers.

    If you'd like to learn more about how Upstream's user identification technology over the mobile network can help mobile operators become essential digital advertising partners in the post-cookie era, get in touch.

    About Upstream
    Upstream is a leading MarTech company in some of the most promising and rapidly growing markets in the world. It is the go-to partner for companies across industries seeking to achieve digital growth. Upstream's proprietary mobile marketing automation platform, Grow, combines innovations in the fields of marketing automation, multi-channel digital communications, data collection and analysis, user identification, and security from online advertising fraud. These capabilities create personalized experiences for end consumers, leading to higher customer engagement and satisfaction, and better monetization. Through the platform, all the different communication channels are controlled via a single UI. Any company deploying Grow is guaranteed a high ROI, paying for actual results based on the goals set. Grow is available both as a managed service and as SaaS.

    Upstream currently works with more than 100 MNOs and e-commerce, insurance, banking, education, retail and FMCG companies across Latin America, Africa, the Middle East and Southeast Asia. For more information on how you can leverage Upstream's Mobile Identity, visit: or send an email to global-
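
    The white paper's headline market figures can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the white paper: the constants are the figures quoted in the text above, while the names and helper functions are assumptions made for this example.

```python
# Figures quoted in the white paper (percentages and USD billions).
SAFARI_FIREFOX_SHARE = 23   # combined browser share of Safari and Firefox, in %
CHROME_SHARE = 65           # Chrome browser share, in %
MOBILE_AD_SPEND_BN = 341    # 2021 mobile ad spend, USD billions
TOTAL_AD_SPEND_BN = 455     # 2021 total digital ad spend, USD billions


def combined_share(*shares):
    """Add up browser market shares given in percent (hypothetical helper)."""
    return sum(shares)


def share_percent(part, whole):
    """Express part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of the browser market that has dropped or plans to drop third-party cookies.
cookieless_share = combined_share(SAFARI_FIREFOX_SHARE, CHROME_SHARE)

# Mobile's share of total digital ad spend.
mobile_share = share_percent(MOBILE_AD_SPEND_BN, TOTAL_AD_SPEND_BN)

print(f"Browsers dropping third-party cookies: {cookieless_share}%")
print(f"Mobile share of digital ad spend: {mobile_share}%")
```

    Under these inputs the browser shares add up to the 88% quoted in the text, and mobile's share of ad spend comes out just under the rounded 75% the white paper cites.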

    Views: 21  Published: 2022-12-22  32 pages  Rating: 5 stars
  • iResearch (艾瑞咨询): 2022 China Intelligent Speech Transcription Industry Research Report (43 pages).pdf

    释放数字生产力,留存探索语音内容2022.12 iResearch Inc.智能语音转写行业研究报告2研究背景:研究对象:在工具不发达的年代,会议记录主要依靠人力完成,以多人合作的分工形式提升记录效率。后随着记录工具不断升级和专业培训,人工转写的效率也在不断提升,专业速录师可依靠速录机完成会议等场景的转写需求,但成本较高。后随着互联网及人工智能技术的不断发展,智能语音转写产品应运而生。尤其在 2011 年,大量研究人员转向深度学习在智能语音领域的研究,利用大数据、机器学习和大算力“三驾马车”,让语音识别的识别准确度再一次得到明显提升,智能语音技术迎来落地应用的发展期。”工欲善其事,必先利其器“,智能化的语音转写服务以价优、质高、便捷的优势满足了转写记录这一交流场景的需求痛点,并在远程办公、新媒体、国际化交流的需求背景下,未来保持强劲市场增长力。作为语音识别技术的产品应用,智能语音转写产品是可以支持长音频识别的语音转文字服务,分为实时语音转写与非实时语音转写,可为信息处理和数据挖掘提供基础。研究方法:本报告通过业内资深的专家访谈、桌面研究、产品对比研究、行业数据统计与行业规模推算输出相应研究成果。艾瑞咨询产业数字化研究部人工智能研究组报告撰写前言对此,艾瑞发布中国智能语音转写行业研究报告,从语音识别-智能转写产品角度出发,确立智能语音转写服务的范围定义,描绘智能语音转写服务的产业图谱与需求市场,梳理智能语音转写服务在 SaaS 软件服务及本地解决方案的不同产品形式、商业模式及厂商格局,并为中国智能语音转写行业的趋势发展提供分析判断,希望通过本报告,为读者呈现中国智能语音转写的产业发展背景、行业厂商动态、产品发展机遇的多维视角,欢迎各界探讨指正。32022.12 iResearch I摘要来源:艾瑞咨询研究院自主研究绘制。从技术趋势来看,语音识别技术的精度和速度仍取决于实际应用环境,面对“混合语种”“嘈杂环境”下的“多人”“交互”“重叠”等多重因素交织的复杂语音场景,语音转写技术应用仍有待突破;从场景价值来看,如今智能转写应用领域大多仅服务于从语音到文字转写内容的实现,未来转写应用可结合自然语言理解、机器学习、知识图谱等AI技术,拓展转写产品的场景边界,深入挖掘转写内容价值,以更高阶、智能的辅助替代角色,为客户提供问题预警、策略总结、决策分析等功能服务;从厂商策略来看,各家将以构建自身产品生态,加强外部场景合作为策略核心,基于自身企业特点选择差异化侧重,共同推进转写技术的应用渗透与市场发展。近年来,智能语音技术与互联网、企业服务、消费硬件、传媒、医疗健康等各行业的深度融合带来了新的用户需求增长和商业模式创新,创造产业经济价值、繁荣产业生态,算法模型、优质数据集与多样化应用场景助力产业规模走高。部分智能语音产品如语音助手、语音转写、智能客服等取得产品价值突破或商业上的显著成就,语音识别相关产品多已进入稳步上升期。但在细分产品的交互体验、使用效果、场景优化等方面仍面临长期求索。人们面对“AI”时希望得到的自然、类人、甚至高信息密度的交互体验,仍然是一个宏伟的开放性课题。在人力成本、协同办公、传媒音视频、会展交流、跨国沟通等多重因素驱动下,中国智能转写市场不断注入需求活力,2021年中国智能语音转写市场规模已约为10亿元。未来,随着智能转写的技术突破、功能丰富及场景泛化,智能转写市场规模将加速上扬,预计2026年市场规模将达到38亿。从产品形态来看,智能转写产品主要包括SaaS类产品与本地化部署解决方案两大类。其中,SaaS市场头部聚集效应显著,讯飞听见与搜狗听写位列第一梯队,讯飞听见在转写准确率尤其是小语种和方言等、产品丰富度、品牌影响力和发展潜力维度拔得头筹。未来,SaaS形式API调用与垂类解决方案将形成合力,构成智能语音转写产业既快且稳的增长飞轮,高生态活性加硬解决方案实力的企业将更能突出重围,抢占更多增量市场。语音识别产品早期主要是语音听写,即用户说一句、机器识别一句;后来发展成语音转写,更聚焦于人人交流场景。智能语音转写是可以支持长音视频的语音转文字服务,附加产品服务、多语种翻译、内容分析等智能化功能,满足用户在会议、庭审、采访、直播、视频制作、客服质检等场景中的实时与非实时语音转写需求。随着语音识别准确性及效率的提升、多语种与方言转写服务丰富,以及上下文纠正、标点过滤、自定义热词配置、声纹角色分离、语音内容分析提取等功能的逐步优化,智能语音转写服务的商业化落地与多场景复用持续推进,成为语音识别产品的“排头兵”。智能语音产业发展智能语音转写产品智能语音转写市场智能语音转写趋势洞察4智能语音转写行业发展背景篇1智能语音转写行业市场分析篇2智能语音转写行业典型企业案例3智能语音转写行业发展趋势篇452022.12 
iResearch I智能语音产业的宏观背景数字信息输入输出的重要载体,人工智能产业落地“先锋军”智能语音技术指通过声音信号的前端处理、语音识别(ASR)、自然语言处理(NLP)、语音合成(TTS)等技术形成完整的人机语音交互流程,是实现人与机器交流的纽带,也是数字信息输入与输出的重要载体。近年来,智能语音技术与互联网、企业服务、消费硬件、传媒、医疗健康等各行业的深度融合带来了新的用户需求增长和商业模式创新,创造产业经济价值、繁荣产业生态。智能语音产业的迅速发展促进了我国数字经济发展、提高了社会治理的智能化水平、推动了我国人工智能技术创新的战略突破。作为人工智能产业落地的“先锋军”,智能语音产业得到了国家和地方政策的有力支持,且随着参与者不断进入智能语音赛道,“百舸争流,千帆竞发”,产业技术水平和产品竞争力不断提高。来源:艾瑞研究院根据公开资料自主研究绘制。发布日期相关机构重点内容2022-05国务院办公厅强化科技赋能,进一步加强12345平台和网上12345能力建设,开发智能推荐、语音自动转写、自助派单功能2021-11工信部工业和信息化部批复组建国家智能语音创新中心,将围绕多语种语音识别、语音合成、语义理解和专用人工智能语音芯片等研发方向,构建集共性技术研发、测试验证、中试孵化和成果转移转化于一体的创新平台2021-01国务院办公厅提出加强自助下单、智能文本客服、智能语音等智能化应用,方便企业和群众反映诉求建议2020-10工信部鼓励智能家居产品普及语音控制功能,推动基于智能语音识别技术的智能音箱、智能可穿戴设备及其他智能家电产品开发,老年人可通过语音方式实现便捷化操作2019-02最高人民法院全面提升语音识别技术在庭审语音同步转录中的应用效能,建成全国法院智能语音云平台,实现全国法院语音识别的模型共享和统一管理2018-04国务院办公厅开展智能医学影像识别、病理分型和多学科会诊以及多种医疗健康场景下的智能语音技术应用,提高医疗服务效率2017-07科技部公布了首批国家新一代人工智能开放创新平台,包括自动驾驶、城市大脑医疗影像和智能语音2017-07司法部大力发展电子公证、法律服务智能保障等业务模式,推进人工智能语音热线和社交网络法律服务机器人技术研发,促进公共法律服务提档中国智能语音产业典型应用场景及政策汇总(部分)传媒制作智能机器人智能客服智能家居协同办公62022.12 iResearch I智能语音产业的市场规模2022年智能语音市场规模达215亿元,产业规模持续走高近年来,我国人工智能产业维持稳步增长态势;其中,智能语音产业基于语音识别等算法模型突破、优质数据集积累和丰富的下游应用场景创新,已进入规模化深耕阶段。我国头部智能语音企业、大型互联网企业等纷纷以“开放平台 垂直赛道”的发展模式,一方面通过语音开放平台为各行业开发者提供智能语音技术支撑,协作场景与产品创新,助力产业规模增长;另一方面凭借各自在消费硬件、协同办公、视频直播等领域的行业理解与用户生态,持续拓展智能车载、娱乐传媒、协同办公、智慧医疗、在线教育、智能家居等垂直行业赛道,以语音为信息的出入口,构建泛语音产业生态集群。2022年中国智能语音产业规模可达215亿元且维持较高增速,预计到2026年产业规模可达469亿元。注释:智能语音典型产品包括对话机器人、智能硬件中的AI语音助手以及教育、医疗、司法、公安、互联网等垂直行业中的智能语音产品及应用。来源:艾瑞咨询研究院根据专家访谈,结合艾瑞统计模型自主研究绘制。2019-2026年中国智能语音产业规模7710915921527233139646941.6E.95.2&.5!.7.6.4%-1 5 0.0%-1 0 0.0%-5 0.0%0.0%5 0.0%1 0 0.0 0 02 0 03 0 04 0 05 0 06 0 07 0 08 0 02019202020212022e2023e2024e2025e2026e智能语音产业规模(亿元)智能语音产业增速(%)CAGR=16.9r022.12 iResearch 
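Product maturity of the intelligent speech industry
Most speech-recognition products have entered the steady-rise stage

Human exploration of machine speech recognition began in the 1950s and has now run for more than 70 years. In 2016, with the help of deep neural networks, machine speech recognition accuracy reached human level for the first time, marking the start of the deployment phase for intelligent speech technology. Since then, its practical usability has kept advancing, driven by higher near-field recognition accuracy, progress in far-field recognition and wake-up, the emergence of full-duplex voice interaction, maturing NLP-based dialogue and question answering, knowledge-graph support for dialogue engines, and algorithm optimization for real-world use. However, the underlying acoustics research, pattern-recognition research, general NLP research, and deep semantic understanding of vertical scenarios have not yet matured enough to form a "barrel" with no obviously short stave. So although some intelligent speech products, such as voice assistants, speech transcription, and intelligent customer service, have achieved breakthroughs in product value or notable commercial success, the interaction experience, effectiveness, and scenario optimization of individual products still require long-term work. The natural, human-like, even information-dense interaction that people expect from "AI" remains a grand open problem.

Source: independent research and charting by the iResearch Consulting Research Institute (艾瑞咨询研究院).

Figure: Maturity curve of China's intelligent speech products, 2022 (product maturity by stage).
Legend: first-tier products are built directly on intelligent speech technologies and can give rise to applications in individual segments, for example intelligent customer service plus finance, or smart speakers based on voice assistants. Second-tier products build on a specific speech technology, for example intelligent speech transcription based on speech recognition, or voice broadcasting based on speech synthesis.
Stages on the curve:
Emerging exploration stage: early technology deployment, low product maturity.
Deployment practice stage: product adoption rises, and the maturity curve climbs slowly.
Rapid growth stage: products are applied at scale, and maturity grows quickly.
Steady-rise stage: maturity stabilizes, and products and services compete on differentiation.
Production maturity stage: maturity is stable, but a leap in the technology may return a product to the rapid growth stage.
Products plotted on the curve (positions not recoverable from the extracted text): speech recognition, intelligent speech development platforms, speech chips, voiceprint recognition, voice input methods, speech transcription, in-vehicle voice assistants, generative AI (audio), speech content moderation, smart consumer hardware, speech synthesis, voice broadcasting, natural language processing, intelligent customer service.

Definition and classification of intelligent speech transcription
An important output form of speech recognition products, divided into real-time and non-real-time transcription

Early speech recognition products were mainly dictation products: the user spoke a sentence and the machine recognized it. They later developed into speech transcription, which focuses on person-to-person communication. Intelligent speech transcription is a speech-to-text service that supports long audio and video. It is divided into real-time and non-real-time transcription and provides a basis for information processing and data mining. It applies to transcribing online and offline meetings, producing film and TV subtitles, news and media work, conference translation, and other settings. As a form of digital labor, it addresses an essential need and raises office productivity. As recognition accuracy and efficiency improve, multilingual and dialect transcription services expand, and intelligent features such as context-based correction, punctuation filtering, filler-word filtering, custom hot-word configuration, voiceprint-based speaker separation, and speech content analysis and extraction are progressively refined, commercial deployment and multi-scenario reuse of intelligent speech transcription continue to advance, making it the "vanguard" of speech recognition products.

Source: independent research and charting by the iResearch Consulting Research Institute.

Figure: Definition and classification of intelligent speech transcription products.
01 Real-time transcription (streaming upload, synchronous retrieval): recognizes an audio stream of unlimited duration as text in real time and returns a text stream with timestamps. It can be used for live-stream subtitles and real-time meeting records, and can be combined with machine translation for simultaneous interpretation.
02 Non-real-time transcription (upload of recorded audio files, asynchronous retrieval): converts long audio data into text. It can be used for film and TV subtitling, transcription of meetings and interviews, and quality inspection of customer-service recordings.
Speech recognition: one link in intelligent interaction, which lets the machine "understand" what a person says; recognition itself is not the end product.
Speech transcription: a speech-to-text service that supports long audio and video and provides a basis for information processing and data mining.

Technical architecture of a speech recognition system
Recognizing a sound waveform sequence to obtain the corresponding word or character sequence

The core of an intelligent speech transcription product is the speech recognition system, which must recognize a given sound waveform sequence and produce the corresponding word or character sequence. The system consists of four parts: signal processing and feature extraction, the acoustic model (AM), the language model (LM), and decoding search. Recognition first processes the audio stream, enhancing the speech by removing noise and channel distortion. It then segments the sound and converts the segments into a series of numeric values, which the acoustic model scores. Finally, decoding search uses the language model to find the best-matching word sequence, which is output as the recognition result. The acoustic and language models are obtained by signal processing and knowledge-mining training on large, pre-collected speech and language databases. Decoding also includes an "adaptive" feedback module that learns from the user's speech and corrects the models, further improving accuracy.

Source: compiled and charted by iResearch from CSDN and other public materials.

Figure: Technical structure of the speech recognition system at the core of intelligent speech transcription products (audio signal → sound features → acoustic model score and language model score → recognition result).
Step 1, signal processing and feature extraction: takes the analog audio signal as input, converts it to a digital signal, and extracts sound features, giving the acoustic model suitable, representative feature vectors.
Step 2, acoustic model: integrates knowledge of acoustics and phonetics, takes the features produced by feature extraction as input, and generates an acoustic model score for the variable-length feature sequence.
Step 2, language model: learns the relationships between words from training corpora (usually text) in order to estimate the likelihood of a hypothesized word sequence and find the text most likely to correspond to the sound features.
Step 3, decoding search: computes acoustic model and language model scores for the given feature vector sequence and a number of hypothesized word sequences, and outputs the word sequence with the highest overall score as the recognition result.
Example values shown in the figure: language model scores 打开空调 0.95, 大凯空调 0.70, 大楷空条 0.35; acoustic model scores "da kai kong tiao" 0.85, 0.95, 0.70, 0.85 and "da kai zhao ming" 0.85, 0.95, 0.20, 0.15.

History of speech recognition technology
Breakthroughs in acoustic models have led the technology's commercial deployment

From the earliest small-vocabulary systems based on isolated words to today's large-vocabulary continuous speech recognition systems, speech recognition has made remarkable progress. Language models rely mainly on the traditional N-gram method (a statistical language-modeling algorithm) for statistical matching. Deep neural network language models are being studied, but in practice they are mostly used for post-processing error correction, or an NLP embedding model is added to bring in context and improve the accuracy of recognition results. Looking at how the technology reached deployment, research on and optimization of the acoustic model has been the main driver of product performance. The acoustic model is a key part of a speech recognition system: it consumes most of the computing resources and determines system performance. In 2009, as deep learning developed, DNN-HMM acoustic models became mainstream and speech recognition made a breakthrough. Since then, different combinations of network structures and optimization strategies have greatly improved acoustic model performance, for example end-to-end recognition models, coarse-grained modeling units, and more complex deep neural networks.

Source: independent research and charting by the iResearch Research Institute based on public materials.

Figure: Breakthrough path of acoustic models in speech recognition.
Template-matching methods dominant (1970s). Template-matching recognition extracts features of the speech signal to build parameter templates, compares the test speech against the reference templates, and labels the speech with the word of the nearest sample. This handles isolated-word recognition well but struggles with large-vocabulary, speaker-independent continuous speech recognition.
Probabilistic and statistical methods dominant (1970s to 2006). Statistical recognition is based on the hidden Markov model (HMM) and the Gaussian mixture model (GMM). In the GMM-HMM framework, the GMM models the distribution of acoustic features and the HMM models the temporal structure of the speech signal. After discriminative training criteria and model adaptation methods for acoustic models were proposed in the 1990s, speech recognition entered a period of slow development.
Deep neural network methods dominant (2006 to present).
2006: deep learning enters its first year of development. In 2009, Hinton applied DNNs to acoustic modeling of speech. At the end of 2011, Microsoft Research applied DNNs to large-vocabulary continuous speech recognition and greatly reduced the recognition error rate. Speech recognition then entered the DNN-HMM era. In addition, LSTM (a recurrent neural network model) has long short-term memory and delivers a stable relative improvement of about 20% over DNNs.
2015 to 2017: end-to-end recognition models can remove the HMM and produce the recognized word sequence directly from the acoustic feature input, further improving recognition accuracy and decoding speed.
After 2017: with the rise of various deep neural networks and end-to-end techniques, vendors have released and continued to optimize their own acoustic model structures, and recognition accuracy keeps rising. Taking iFLYTEK (科大讯飞) as an example, its Chinese and English recognition accuracy was only about 60% in 2010, while by August 2021 its Chinese and English transcription accuracy had exceeded 98.33%.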
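
The decoding-search step described in the report (score each hypothesized word sequence with the acoustic model and the language model, then output the highest-scoring one) can be illustrated with a small sketch. It rests on assumptions that the report does not state: the figure's numbers are read as per-syllable acoustic scores and per-sequence language model scores, the two are combined as a sum of logarithms with a weight `lm_weight`, and the candidate 打开照明 with its score of 0.90 is hypothetical, added only so that the second pinyin string has a word sequence to compete with.

```python
import math

# Per-syllable acoustic scores, read from the figure's example values.
ACOUSTIC_SCORES = {
    "da kai kong tiao": [0.85, 0.95, 0.70, 0.85],
    "da kai zhao ming": [0.85, 0.95, 0.20, 0.15],
}

# (word sequence, pinyin, language model score).
# The first three scores come from the figure; the last row is hypothetical.
CANDIDATES = [
    ("打开空调", "da kai kong tiao", 0.95),
    ("大凯空调", "da kai kong tiao", 0.70),
    ("大楷空条", "da kai kong tiao", 0.35),
    ("打开照明", "da kai zhao ming", 0.90),  # hypothetical candidate and score
]


def total_score(pinyin, lm_score, lm_weight=1.0):
    """Log acoustic score plus weighted log language model score (assumed combination)."""
    acoustic_log = sum(math.log(p) for p in ACOUSTIC_SCORES[pinyin])
    return acoustic_log + lm_weight * math.log(lm_score)


def decode(candidates, lm_weight=1.0):
    """Return the word sequence with the highest combined score."""
    best = max(candidates, key=lambda c: total_score(c[1], c[2], lm_weight))
    return best[0]


print(decode(CANDIDATES))
```

Working in log space keeps the product of many small probabilities numerically stable, and `lm_weight` is the usual knob for trading off the two models. A production decoder searches a lattice rather than a fixed candidate list.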
I智能语音转写的需求场景以转写功能为基础,满足细分场景需求,构成丰富产品形态自从以远场语音技术落地为代表的智能音箱产品规模化应用、深度神经网络下的声学模型研发创新进入平稳发展期后,语音识别赛道的产业竞争已经从标准环境下的算法研发比拼,过渡到了在真实细分需求场景下如何满足用户体验的竞争。智能语音转写产品也遵循这一赛道特征,以语音转文字功能为基础,附加产品服务、多语种翻译、内容分析等智能化服务功能,满足用户在会议、庭审、采访、直播、视频制作、客服质检等场景中的实时与非实时语音转写需求。智能语音转写产品具备丰富的产品形态,可应用于娱乐传媒、在线教育、会议会展、同传等多行业领域,帮助提升企事业单位办公人群、学生、自媒体从业人员、翻译专业人士等各类群体的工作效率。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写产品的需求场景转写功能语种翻译产品服务内容分析实时场景非实时场景提供会议记录及会后整理,可附加会议软件等产品功能提供字幕转写服务,可附加音视频编辑相关产品功能提供语音转写服务,在多语种环境下,附加实时/非实时翻译功能提供人机耦合服务,译员配合智能转写内容优化最终产出提供语音转写服务,对转写文本进行内容追踪、实时提醒、处理分析、风控质检等等操作实时会议记录实时直播字幕实时庭审记录实时客服记录会议纪要总结音视频字幕编辑庭审数据录入黄暴等语音质检对响应时间要求更高,需进行模型蒸馏与模型优化对响应时间要求相对较低,可通过闲时转写实现需求错峰实时采访转写实时会议同传课堂录音分析电话销售/客服122022.12 iResearch I智能语音转写的价值意义存量助力人工转写市场,增量释放更多潜在场景需求传统人力转写市场依赖经验丰富的速录师与人工转写团队,成本相对高昂,而随着智能语音转写产品的规模化落地应用,该类存量市场可借助智能转写产品,实现对人工转写的有效辅助及优化,为下游客户提供更高质效的人机耦合服务;此外,转写应用仍有更大规模的潜在市场需求待挖掘,原受限于渠道、价格等因素,转写产品多应用于有垂类转写需求的小众应用领域,而智能语音转写产品逐步让转写应用实现泛化,市场边界也将逐步扩散,未来智能语音转写产品有望开发更多潜在增量市场,撬动可用智能转写产品满足的长尾需求,进一步优化用户的应用体验。来源:艾瑞研究院根据公开材料自主研究绘制。智能语音转写产品在助力转写人力基础上,可满足更多潜在、可被优化的转写场景需求。智能语音转写产品意义1)优化传统转写人力服务2)满足更多潜在可被优化需求增量市场存量市场本身场景存在潜在转写需求,但人力实现需要高成本或原本人力难以做到,而智能转写产品可开发该类潜在增量市场,释放更多产值规模。传统人工转写费时费力,且转写质量与个人能力高度挂钩,可借力智能转写产品提高存量市场的转写服务渗透率。通过智能语音转写产品撬动更多长尾需求例:个人办公场景,有会议内容的潜在转写需求,出于时长与精力考量不会自做,出于成本考量不会外购,但可通过智能语音转写产品获得优质高效、兼具性价比的转写服务。例:沟通交流场景,在多语种、方言沟通的日常交流环境中,存在潜在语音转写需求,可通过转写产品跨越语言障碍,实现高效沟通。13智能语音转写行业发展背景篇1智能语音转写行业市场分析篇2智能语音转写行业典型企业案例3智能语音转写行业发展趋势篇4142022.12 iResearch I智能语音转写产业图谱来源:艾瑞根据公开资料自主研究绘制。下游应用领域2022年中国智能语音转写产业图谱办公场景传媒场景电商直播翻译场景上游基础设施层产品及解决方案提供商服务器云服务数据服务开源模型智能语音企业云服务厂商专业转写/翻译厂商C端用户B端企业G端政府其他场景152022.12 iResearch I智能语音转写的发展驱力(1/5)智能语音转写可化解人工成本走高与质量要求提升的发展矛盾近十年来,中国人口增势放缓,劳动人口红利见顶,供应结构性短缺致使企业人力用工成本不断攀升。根据国家统计局数据,2020年中国租赁和商务服务业城镇单位就业人员平均工资已达到92924元,相比十年前涨幅已达到1.35倍。人工转写成本的大幅上涨为转写行业带来更多价格压力。此外,随着转写场景的泛化升级,转写需求渗透到各行各业,转写内容专业度也不断提升,具备行业背景知识的转写译员更成为市场供给侧的稀缺人力资源,且转写交付水平存在不稳定性,与个人服务能力高度挂钩。在此发展背景下,转写市场亟需智能语音转写产品,以辅助优化人工转写产品的角度切入,提供低成本、高质量、具备稳定交付水平的转写服务,满足更多市场需求缺口。395664697653162625386713172489767828139385147881909292418.7.2.6%7.3%8.0%5.9%6.0%4.6%3.6%5.4 
102011201220132014201520162017201820192020租赁和商务服务业城镇单位就业人员平均工资(元)平均工资增长率(%)2010-2020年中国租赁和商务服务业城镇单位就业人员平均工资情况来源:国家统计局,艾瑞研究院自主研究绘制。162022.12 iResearch I2022.12 iResearch I智能语音转写的发展驱力(2/5)企业协同在线办公常态化,助力转写功能实现更多用户触达2020年初,受疫情影响,很多企业无法按时复工复产,远程办公成为维持社会经济正常运行的重要平台应用,用户需求显著提升,视频会议、电话会议、在线文档编辑等远程协作功能得到更广泛应用。根据中国互联网络发展统计报告数据,2022年月中国在线办公用户规模已跃升至4.7亿,相比2020年6月增长幅度高达131.4%。如今疫情仍在延宕反复,随着用户在线协同办公习惯的逐渐养成,远程协同办公或将成为常态化运营工具,持续推动企业数字化转型。而相较于硬件录音与录音应用的产品形式,会议应用无需用户购买录音设备或额外开启录音应用即可触达转写服务,提供了更直接的应用切入点,助力转写功能在办公场景实现更广泛的用户触达。来源:中国互联网络发展统计报告,艾瑞研究院自主研究绘制。来源:艾瑞研究院自主研究绘制。2018年6月-2022年6月中国在线办公用户规模及使用率2.0 3.5 3.8 4.7 4.6 21.24.97.7E.4C.8 20.62020.122021.62021.122022.6用户规模(亿人)使用率(%)办公场景对智能转写产品的需求分析录音应用硬件录音会议应用转写产品办公场景e.g.录音笔e.g.语音备忘录e.g.腾讯会议、讯飞听见专业办公人士,高频录音场景,对会议转写有强需求,需要额外硬件设备提供在线/离线转写服务。会议APP提供远程会议平台,通过会议APP录制音视频,为转写产品提供直接功能切入点。通过手机或电脑的录音软件录音,随后将录音文件上传至平台或APP,完成录音文件转写。需要硬件设备 需要额外录音 搭载办公会议平台相较传统需要录音笔与录音应用的场景,协同在线办公平台及会议应用让转写功能触达到更多办公人群,应用渗透率进一步提升。172022.12 iResearch I2022.12 iResearch I智能语音转写的发展驱力(3/5)网络视频兴起,为转写产品开拓更多应用空间随着数字技术与互联网技术的普及,网络视频快速发展,短视频因满足用户高涨的碎片化娱乐需求而迎来一拨爆发式增长,进一步提升用户对整体网络视频领域的关注度与渗透率。如今网络视频已然成为人们生活娱乐、了解信息的重要组成形式。根据中国互联网络发展统计报告数据,2022年6月,中国网络视频用户规模已经达到9.9亿人,占全部网民的94.6%。作为网络视频的供给方,自媒体工作者、长视频内容编辑方均对视频内容的字幕转写具备强需求,一方面字幕可帮助用户更好观看视频内容,并在静音模式也不影响观看;另一方面字幕转写还可提供翻译功能,助力网络视频在国际环境下的推动传播;此外,对于平台监管方来说,语音转写可服务于平台内容监控需求,及时进行内容管理,避免网络直播及视频带来的合规风险。综合来看,网络视频的长足发展为转写产品开拓了更多市场应用空间。来源:中国互联网络发展统计报告,艾瑞研究院自主研究绘制。来源:中国互联网络发展统计报告,艾瑞研究院自主研究绘制。7.1 7.2 7.6 8.5 8.9 9.3 9.4 9.7 9.9 88.7.5.8.1.5.7.4.5.6 18.62018.122019.62020.32020.62020.122021.62021.122022.6用户规模(亿人)使用率(%)2018年6月-2022年6月中国网络视频(含短视频)用户规模及使用率网络视频对智能转写产品的需求分析自媒体多语种转写长视频编辑语音内容监控服务于内容生产用户,智能切分时间轴。生成带时间戳的转写字幕内容,支持在线编辑调整,极大提升自媒体工作者的字幕配置效率。为外语视频提供转写及翻译服务,可根据需要配置专业翻译团队,实现高效人机耦合,完成多语种的字幕制作及翻译需求。服务于长视频编辑工作者,例如电影、纪录片等,长视频的语音转写更强调上下文联系及方言理解,对语音技术提出更高要求。实时转写可实时识别直播内容风险,并给出及时警告提示;非实时转写可对平台内容进行进一步甄别提示。182022.12 iResearch I2022.12 iResearch 
I智能语音转写的市场环境(4/5)会展双线融合举办不断提升,SaaS转写产品需求走高在2020年以前,会展行业多在线下举行。面对国际语言的交流环境,会展行业的字幕转写产品大多采用线下人机耦合的服务模式,即专业的语音转写服务团队与硬件机器设备相结合,为会展交流提供字幕上屏、多语种同传等的现场会议服务。而在疫情多点散发的情况下,会展活动的举办面临很多不确定性因素。根据中国会展主办机构数字化调研报告显示,2021年,疫情导致各类会展活动取消、延期、异地举办,会展活动选择线上线下相结合模式举办成为常态。字幕转写产品形态也由原来线下的人机耦合形式逐渐倾向于线上SaaS服务形式,并可配合线上人工智能服务团队或翻译团队提供实时校验服务。此外,SaaS产品形态的需求延伸进一步丰富转写产品的客群覆盖度,除会展举办方外,更多C端用户也可通过SaaS转写及翻译产品满足个人国际参会、实时翻译的会展需求。2021年中国会展主办机构办展办会方式注释:N=195。来源:DRCEO:中国会展主办机构数字化调研2022,艾瑞咨询研究院整理及绘制。2021年中国会展主办机构调研主要数据注释:N=195。来源:DRCEO:中国会展主办机构数字化调研2022,艾瑞咨询研究院整理及绘制。根据调研显示,近70%的主办机构选择双线融合办展的方式,线上线下结合已成为会展常态。u双线融合办展趋势31.3%的机构认为数字化转型是大方向,超过50%的机构已经开始数字化转型尝试。u数字化转型方向根据调研显示,超过60%的会展机构能获得各位数字化收入。但数字化收入占比有待提升。u数字化收入占比超过90%的机构对数字化转型呈积极与乐观态度,该比例相较于2020年提升6个百分点。u数字化转型态度735083271022131483320214910161纯线下举办纯线上举办线上 线下相结合举办1-3场(个)4-5场(个)6-10场(个)10场以上(个)以上均没有举办(个)线上会展成为线下举办的延伸助力,线上 线下呈现深度融合的发展趋势192022.12 iResearch I2022.12 iResearch I智能语音转写的市场环境(5/5)转写产品助力解决出海生态下的复合型翻译人才需求近年来虽然新冠疫情反复、地缘冲突加剧,全球经济发展变数频发,但中国企业出海浪潮已逐渐越过探索期,在视频、游戏、电商、企业级SaaS服务等各领域催生出“出海繁荣”。2021年,中国对外直接投资净额1788.2亿美元,比上年增长16.3%,连续十年位列全球前三,且超越出现统计数据以来首次负增长的2017年绝对值。目前,由于海外市场仍处于高速增长阶段且出海市场各赛道集中度不高,我国出海行业仍具有极大潜力,在企业业务运营、跨国交流等领域对复合型翻译人才需求较大。根据中国翻译协会调研,高级翻译人才稀缺、非通用语种人才匮乏、高校教育与实际工作需求脱节、无法满足多个专业领域翻译需求是翻译行业面临的发展难点。在此背景下,智能语音转写产品的翻译及同传功能,不仅能有效提高翻译工作者的工作效率,同时人机耦合的形式也使各领域的非翻译专业人才具备完成业务需要翻译工作的可能性。来源:商务部、国家统计局和国家外汇管理局,艾瑞研究院绘制。来源:中国翻译协会2022中国翻译人才发展报告,艾瑞研究院绘制。2016-2021年中国对外直接投资净额1582.9 1430.4 1369.1 1537.1 1788.2-19.3%-9.6%-4.3.3.3 172018201920202021中国对外直接投资净额(亿美元)增长率(%)2021年中国复合型翻译人才需求情况31%8%8%7%6%外交学、国际关系新闻传播类理工及其他专业法学类经济学类哲学类、中国语言文学类电子信息类、管理科学与工程类202022.12 iResearch I智能语音转写的行业规模需求活力持续注入,预计2026年市场规模达38亿目前,智能转写产品率先在办公会议、传媒音视频、会展交流等领域展开应用,用户接受度日益成熟。据艾瑞研究院统计测算,2021年中国智能语音转写市场规模已约为10亿元。未来,随着智能转写的技术突破、功能丰富及场景泛化,智能转写市场规模将加速上扬。此外,转写产品可结合NLP、知识图谱技术在单纯转写内容的基础上升级为分析策略的输出层级,释放更多价值势能,预计2026年中国智能语音转写行业市场规模将达到38亿元,2021-2026 五年CAGR=30.7%。来源:艾瑞研究院根据桌研与专家访谈自主建模测算。2021-2026年中国智能转写行业规模10131722293828.3).91.12.91.4 212022e2023e2024e2025e2026e智能转写行业规模(亿元)智能转写行业规模增长率(%)212022.12 iResearch 
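
The five-year growth rate quoted above can be checked with the standard compound annual growth rate formula. This is a minimal sketch that assumes the rounded endpoints from the report (about RMB 1.0 billion in 2021 and RMB 3.8 billion in 2026, written below in units of RMB 100 million); the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


market_2021 = 10.0  # RMB 100 million, rounded figure from the report
market_2026 = 38.0  # RMB 100 million, forecast from the report

rate = cagr(market_2021, market_2026, years=5)
print(f"{rate:.1%}")
```

With the rounded endpoints the result is about 30.6%, close to the reported 30.7%. The small gap is presumably because the report computed its figure from unrounded values.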
I智能语音转写的参与者类型以语音技术、产品生态、细分领域为多样立足点根据参与厂商的市场立足点划分,智能语音转写赛道的玩家可分为语音技术厂商、云服务厂商与专业转写及翻译服务商。其中语音技术厂商在语音识别能力、转写服务水平上具备先发优势,且投入足够精力进行技术研发与产品打磨,产品化能力优秀,现占据智能语音转写市场的主流厂商地位;而云服务厂商的转写能力对内服务于内部产品的转写功能需求,对外多选择开放语音转写能力达成外部合作以丰富平台生态,垂直于转写的产品化能力较弱;专业转写及翻译厂商通常以细分领域切入,深耕于办公、翻译、传媒等某个细分领域,在垂类市场提供精细化、客制化产品及解决方案,满足细分客户的转写服务需要。来源:艾瑞研究院自主研究绘制。智能语音转写参与者类型分析以语音技术切入以产品生态切入以细分领域切入语音技术厂商云服务厂商专业转写/翻译厂商强于语音识别能力,为客户提供语音转写接口、SaaS产品及全套解决方案等多样化转写产品形式。除软件服务外,硬件设备是触达用户的核心端口,部分语音技术厂商选择从AIoT领域切入,依托于智能耳机、智能录音笔、智慧屏等智能硬件产品进一步开拓转写应用场景传统转写或翻译服务商,持续积累垂直转写需求客群,顺应智能转写技术发展,切入细分领域,提供人机耦合的优化产品服务。依附公司产品生态,见长于平台化能力,在办公、泛娱乐、教育等场景搭配软硬件产品输出转写能力,一般分为对内与对外服务厂商代表:科大讯飞、搜狗听写、思必驰、捷通华声厂商代表:阿里云、腾讯云、百度云、火山引擎厂商代表:网易见外、迅捷语音222022.12 iResearch I智能语音转写的产品形态包括SaaS类产品及本地化部署解决方案,均可结合智能硬件智能语音转写服务的产品形态主要包括SaaS类产品与本地化部署解决方案两大类。以SaaS类产品为主,其核心是提供云端语音识别及转写服务,根据客户分类与应用情景差异,包括轻量级的网页版/APP/PC/小程序产品和提供给B/G端客户的API开发接口。SaaS类产品的主要特点是价格相对便宜、便捷度较高;而本地化部署的解决方案主要是为了满足客户的安全隐私与定制化需求,例如接入到政企内部办公平台等,需要服务商具备定制化开发能力。此外,为了提升语音采集的质量及多样化的移动应用场景,头部厂商如讯飞听见、搜狗听写等开发了种类丰富的功能性智能转写硬件,如录音笔、麦克风、智慧屏等,可提供云端或本地转写、录音、存储、编辑一体服务。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写产品形态提供单机版软件/私有化部署SDK接口,在本地可运行语音识别及转写能力。满足客户的定制化需求与安全隐私需求,但部署成本高,主要面向对数据安全需求较高的大型企业或公检法、广电传媒等政府客户通过硬件内置芯片与本地词库,提供本地/离线转写服务。满足对数据及网络安全、便捷性及移动办公等需求。移动端转写能力与实用性的提升,扩充转写功能的适用范围SaaS类产品本地化部署解决方案通过Web/APP/PC/小程序等提供云端语音识别及转写服务,主要服务于C端客户或企业账户,企业账户或具备空间管理、协同编辑等增值服务。通过行业词库和模型优化,产品可满足传媒、教培、金融、客服等多场景应用需求以录音笔、麦克风、智慧屏等语音采集硬件为依托,调用云端语音识别及转写能力利用麦克风阵列,通过声学技术保障拾音效果,以提升语音采集精准度。软硬一体形式提升转写质量及效率,并满足会议、访谈等多类型需求场景丰富消费级智能硬件产品形态,提高产品售价、促进营收增长智能硬件价值点提供封装语音转写能力的API接口。下游应用开发商和手机、录音笔等智能终端厂商可进行集成232022.12 iResearch 
I智能语音转写的收费模式与用户画像知识密集行业用户的办公效率提升利器,下游客户类型丰富1)SaaS产品的前期投入主要集中于产品研发以及固定的IT支出,得益于其能够同时为多租户提供服务的特性,使得SaaS的边际成本极低。这既给SaaS厂商带来了相当可观的边际利润,也让厂商在面对同类竞争时得以在价格上做出更多让步。对于C端客户的语音转写服务需求,产品提供方在早期一般采取低价或免费试用时长的模式集聚用户,占领用户心智,迅速做大用户量。后期营收增长依赖满足准确率与实时率下的刚需客户续费率、深耕多样化场景以拓宽潜在客户市场、软硬一体的智能硬件产品拉高营收等;而企业客户的价格敏感度则相对较低,更关注转写精准度和实时性体验等。对于远程会议、视频剪辑、CRM等下游应用,则多将语音转写作为附加功能提供增值服务,用户可付费解锁。2)本地部署解决方案可满足政企客户的定制化与安全隐私需求。但部署成本高,项目制报价形式涵盖软件服务、实施与运维、硬件设备等费用。客户在关注转写效果的同时,亦关注安全性、驻场训练语料、设备安装等实施及售后服务能力。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写产品的收费模式与用户画像免费应用后向广告收费按照时长和并发计费SaaS类产品单笔订单单笔付费按月/年订阅制储值卡(时长)企业账户附加功能转写服务转写能力接口一次性license智能硬件硬件付费 软件服务免费本地部署解决方案个人用户画像:主要是学生、媒体工作者、IT/金融办公人群等。主要来自于一二线城市的知识密集型行业。其中PC端使用者多为有强办公需求的企事业单位用户,更重电脑音频编辑企业账户画像:主要集中于影视剧后期、教培机构等企业账户基础收费模式同上,开通空间管理、协同编辑等增值服务赋能下游手机、录音设备等硬件厂商单机版软件费用项目制报价硬件设备费用免费使用转写、翻译等语言服务用户画像:主要面向政府、高校及大型企业。客户需要转写功能的对接与嵌入,对于数据安全、可拓展、灵活性要求更高,包括对需求响应的及时程度等私有化部署费用运维费用丰富的下游场景应用客户242022.12 iResearch I智能语音转写SaaS产品分析高便捷性、开箱即用、按需使用、快速响应及多场景优化1)基于SaaS的语音转写服务产品通过将音频文件上传至云端,由云端转写引擎进行识别、转写、纠错,完成实时或非实时的语音转写输出。终端用户可以在网页或者APP上获取结果,还可对结果进行编辑、分享、导出等操作。语音转写服务厂商通过多领域的语音转写模型优化和行业词库,迭代更新以提升不同应用场景下的转写准确率,服务多类型客户。随着云计算技术发展,目前云端算力和网络环境比较稳定,SaaS转写产品的转写准确率和效率与私有化部署解决方案的用户感知度差距不是特别显著。高便捷性、较低成本等优势使语音转写SaaS产品拥有庞大的终端消费群体。2)且SaaS形式的转写产品具有开箱即用无需维护、按需使用等特点,可被集成到下游应用软件或手机、智慧屏、录音笔、智能会议系统等各类硬件设备中。API转写引擎可支持远程会议、线上会展、电商直播、短视频、在线课堂等软件应用的纪要转写、字幕制作、同传翻译等功能,拓宽应用的产品服务边界。广泛的下游生态也有助于语音转写产品加速起量,扩大潜在市场空间。来源:艾瑞研究院根据公开资料自主研究绘制。付费方式灵活,可通过充值时长卡的方式随时使用转写服务或根据调用量及并发量订阅付费按需使用、成本较低进行语言模型和行业词库优化,满足多应用场景的客户转写需求。可应用于轻办公、会议会展、传媒、短视频直播、同声传译等领域针对多应用场景优化通过网络提供服务,用户可多设备、多渠道接入,随时访问;且数据储存在云端,实时同步高便捷性低时延,秒级甚至毫秒级处理返回语音识别结果,支持同传、直播等实时转写场景需求快速响应B端客户接入语音转写能力,可随时调用,模型及时迭代更新开箱即用、无需维护智能语音转写SaaS产品特点252022.12 iResearch I2022.12 iResearch 
I智能语音转写SaaS产品发展环境云计算普及助力下游企业便捷应用语音转写服务智能语音转写SaaS产品的普及推广离不开我国云计算基础设施的建设和技术成熟以及企业数字化转型趋势。我国云服务市场规模不断增长,2021年中国整体云服务市场规模为3280亿元,同比2020年增加45.4%,根据艾瑞咨询推算,未来几年的增速仍维持在30%以上。企业对云计算的接受程度也在不断提高。中国信通院数据显示,2019年中国企业应用云计算的比例达到66.1%,较2017年增长11.4pct,企业在经历信息化阶段后开始向数字化转型。而在企业数字化转型过程中,可有效提高会议交流、字幕转写编辑、同声传译等场景办公效率的语音转写SaaS产品,具备交付灵活、使用便捷等优势,且可降低企业现金流压力,对泛互联网等各类企业的数字化转型和办公效率提升具有重要意义。来源:艾瑞咨询研究院自主研究推算及绘制。来源:中国信通院来源:信通院2020年云计算发展白皮书,艾瑞咨询研究院自主研究及绘制。2016-2025年中国整体云服务市场规模及增速45.3A.43.9T.7X.6f.1 1720182019没有云计算应用(%)有云计算应用(%)2017-2019年中国企业云计算使用率52169310261612225632804769681295501268332.13.2H.1W.19.9E.4B.8.22.80.6 16201720182019202020212022e 2023e 2024e 2025e整体云服务市场规模(亿元)整体云服务市场增速(%)262022.12 iResearch I智能语音转写SaaS产品竞争要素转写准确度和效率、产品丰富度是核心要素综合赛道特征,艾瑞咨询评估智能语音转写SaaS产品竞争要素包含:转写准确度与效率、产品丰富度、品牌影响力、价格优势、用户体量与生态、发展潜力六个方面。从客户选择产品的角度看,虽然不同客户类型和应用场景的需求会面临一定差异,但转写准确度和效率、产品丰富度是解决用户问题的第一前提;在此基础上,有价格优势、品牌影响力大的玩家更容易受到客户青睐。此外,用户体量与生态实力强、发展潜力大的产品市场竞争优势更明显。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写SaaS产品竞争要素转写准确度与效率产品丰富度品牌影响力价格优势用户体量与生态发展潜力指该产品支持应用场景(会议、会展、同传、字幕等)、行业领域(金融、教育、零售、客服等)、产品形态(网页、APP、API/SDK调用、智能硬件等)的覆盖情况指该品牌产品的内/外部调用量和下游用户类型广度(消费者、企业级、政府客户等)指不同收费模式下的产品单位价格;C端消费者相对价格敏感度高评价产品功能水平的直接指标。除核心的语音识别准确率外,上下文纠正、语气词过滤、角色分离、热词设置等智能化功能可提高转写服务准确度指品牌开拓市场、占领市场、并获得利润的能力,核心评价维度来源于厂商端及用户端对品牌的直接评价及认可指该品牌产品的未来市场空间。基于其技术实力、产品化能力、服务水平及发展战略综合评估品牌影响力产品丰富度转写准确度与效率价格优势用户体量与生态发展潜力竞争要素转写准确度与效率和产品丰富度为满足各类用户需求的核心要素L1L2L3注:根据行业调研厂商表现,将竞争要素对应进行L1/L2/L3级评分272022.12 iResearch I智能语音转写SaaS产品竞争格局市场头部聚集效应显著,参与者致力差异化深耕现阶段,我国智能语音转写产品市场较为集中,讯飞听见和搜狗听写的头部效应明显;但在产品同质化压力下,参与厂商也均积极在转写的各细分专业领域、云端及本地化服务形式、附加产品形态与产品生态多角度进行差异化深耕。根据六大竞争要素,艾瑞咨询将市场上提供智能语音转写SaaS服务的厂商分为三个梯队,其中语音技术厂商讯飞听见和搜狗听写位列第一梯队。讯飞听见在转写准确度尤其是针对小语种和方言等、产品丰富度、品牌影响力、发展潜力维度拔得头筹。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写SaaS产品竞争格局第一梯队第二梯队第三梯队长尾厂商厂商在各竞争维度优势明显具有一定的品牌影响力,或深耕C端用户运营推广,或依托品牌自有用户生态,或依托下游开发者生态推广相关业务。具备一定生态优势,但在多语种、方言等场景下的转写准确率可做进一步提升。受限于转写能力、产品丰富度等因素,客户市场份额较小。拥有一定价格优势,但在其余维度表现多有所不足。品牌影响力产品丰富度转写准确度与效率价格优势用户体量与生态发展潜力讯飞听见在转写准确度、产品丰富度、品牌影响力、发展潜力维度拔得头筹。282022.12 iResearch 
I智能语音转写本地部署解决方案产品服务升级,高安全性与定制化满足大型政企客户需求为满足大型企业及政府客户对安全性和定制化的需求,智能语音转写SaaS厂商升级产品和服务,提供私有化部署形式和软硬一体的产品解决方案。1)本地部署的纯软件解决方案与SaaS产品的功能类似,但私有化部署的独立服务器形式可保证客户对数据保密的安全性需求且架构自主;同时,语音转写能力提供商可针对客户提供的特定语料进行模型训练,满足客户的定制化转写需求,贴合用户业务场景,计算和执行效率更高。2)为了满足政企大客户的会议室、展会、传媒编辑等线下场景的智慧办公需求,软硬一体的语音转写解决方案可打包提供定制化拾音功能硬件、多语种语音转写与翻译能力、软硬一体化开发接口等;对于随身携带且有隐私要求的离线转写场景,一体机形式的语音转写设备则将硬件拾音、软件与服务集成在一起,无需联网,即开即用。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写本地部署解决方案特点智慧屏会议系统办公专网提供的私有云固定会场的服务器部署u 产品服务升级支持离线转写场景架构自主数据保密安全需求定制化语料训练软硬一体的一站式方案移动办公的离线单机版产品私有化部署形式线下软硬一体产品292022.12 iResearch I智能语音转写产业的飞轮模型API经济与垂类解决方案共拓产业广度与深度平台类厂商开放平台API经济可拓展智能语音转写产业的广度,形成平台效应,利用下游开发者的创新活性带动市场发展,随开发者生态聚集带来庞大的下游规模经济效益;同时,垂类解决方案则延伸产业深度,聚焦刚需应用与高价值环节,延伸出了录音笔等智能硬件、协同办公会议应用、提取长时语音信息有效内容等多条增量建设与运营需求业务线。API经济与垂类解决方案两者合力,相辅相成,形成智能语音转写产业既快且稳的增长飞轮。在此基础上,高生态活性加硬解决方案实力的企业更能突出重围,抢占市场。来源:艾瑞研究院根据公开资料自主研究绘制。智能语音转写产业的飞轮模型深度垂类解决方案核心竞争力软硬一体占据高价值环节,形成应用流量入口:围绕语音转写需求场景的核心痛点,录音笔、智慧屏、智慧会议系统等入口级智能硬件可延伸出多条增量建设与运营需求业务线,提供想象空间刚需高频应用增肌造血:为转写技术找到可打磨的场景,如协同办公、电商直播等,结合场景Know-How反哺技术研发,形成良性闭环API产业活力与不设限空间规模效益与高毛利:SaaS产品利用率更高、单位成本降低。轻量化的输出模式可以持续低成本、短账期促进营收增长平台效应:聚合合作伙伴,扩大影响力并实现语音转写技术下沉,塑造产业生态保持活性:构建动态更新的产品服务池,利用偏C端活性带动B、G端需求,拓宽企业级客户增长广度业务飞轮30智能语音转写行业发展背景篇1智能语音转写行业市场分析篇2智能语音转写行业典型企业案例3智能语音转写行业发展趋势篇45312022.12 iResearch I讯飞听见科大讯飞成立于1999年,是亚太地区知名的智能语音与人工智能上市企业,讯飞听见是科大讯飞旗下主打“AI 办公”的子品牌,为客户提供以语音转文字及多语种翻译为核心功能的智慧办公服务。依托公司深耕多年的自然语言处理、声纹识别、语音识别、翻译等核心技术,讯飞听见的产品化能力也愈发成熟,打磨出平台服务、会展传媒服务、智能硬件产品、行业解决方案四条核心产品线,布局逐步完善,覆盖广泛下游应用场景,助力C端、B端及G端提升工作效率,实现高效知识管理。来源:艾瑞研究院根据公开资料、公司官网自主研究绘制。讯飞听见转写产品线科大讯飞旗下“AI 办公”品牌,聚焦语音转写及翻译市场、平台服务 聚焦服务办公领域,在会议纪要整理、远程视频会议、跨国语言交流等场景,助力力企业高效完成办公系统智能化升级。智能硬件产品 AI加持,软硬件一体,以转写文字及翻译为核心功能的智能硬件,无缝连接讯飞听见网站、App、客户端,支持多种语言、方言,可有效提升学生在校学习和职场人办公记录效率。会展传媒服务 提供“采编播审存”一整套流程的产品;为长短视频剪辑工作者提供字幕转写产品;为会展行业提供线下一体机、线上SaaS服务的同传服务;基于转写服务为会展传媒行业打造可持续的AI应用生态圈。行业解决方案 
以语音识别、机器翻译、语义理解、OCR识别等能力为基础,萃取“非结构化数据”,拓展数据维度,构建知识管理体系,辅助高效决策。为政府、企业用户打造贯通会前、会中、会后的智慧办公解决方案。讯飞听见(转写)讯飞听见翻译讯飞听见会议讯飞听见同传讯飞听见字幕讯飞听见媒体解决方案录音笔麦克风智慧屏讯飞听见智能会议系统讯飞听见智慧办公室解决方案多终端服务(PC/Web/APP/小程序)软硬件协同场景化服务多领域词库AI智能处理人机耦合时间码自动匹配多语种字幕专业级录音实时同步编辑免费转写服务软硬件一体化开发接口支持公有云和私有化部署322022.12 iResearch I讯飞听见让办公更高效,让生活更简单,让沟通无障碍作为科大讯飞语音转写及翻译的重要业务承接,讯飞听见在业界的语音转写准确率、产品智能化应用、多领域场景化应用、多语种和方言表现上出色,并整合平台和人工译员等资源搭建语音语言服务平台,让机器与人工实现取长补短的融合,极致发挥人机耦合效能。如今,讯飞听见生态用户破亿,覆盖用户已超越5000万,并与众多B端客户合作打造行业生态平台,共同参与公益活动,让听障人士通过文字去感受世界、与人沟通交流,通过AI语音赋能产品,建立起与听障人士沟通的桥梁。未来,讯飞听见将以更积极的态度履行品牌使命:让办公更高效,让生活更简单,让沟通无障碍。来源:艾瑞研究院根据公开资料、公司官网自主研究绘制。讯飞听见转写业务优势高识别准确率、多语种翻译、稳定丰富产品性能应用实例行业生态伙伴公益行动生态共荣,开放API能力接口,服务生态合作伙伴。听见AI的声音:与中国聋协残疾人艺术团联合发起听障关怀公益“听见AI的声音”,累计为用户捐赠时长6000万分钟。B站无障碍直播间字幕:观看英雄联盟S11、2022英雄联盟MSI和2022英格兰足总杯活动。转写精准语种丰富会议纪要智能化场景化隐私安全全链路多终端产品,客户类型多元 准确率97.5%,1小时音频最快5分钟出稿。支持10种国家语言转写、12种地方方言、2种少数民族语。会议内容实时转写,边录边转;会议信息快速整理,清晰明了;关键内容实时标记,一键定位。智能纠错、语气词过滤 角色分离:智能区分说话人,标记多角色,快速整理稿件 根据不同行业客户,提供16个行业词库 适配不同客户需求,支持音视频、文档、链接等多格式 通过可信云认证,信息加密全程保证 硬软件一体、行业定制解决方案定制、私有化部署等。客户覆盖职场个人、政府企业、文化传媒等。同时搭建语音语言服务平台,整合AI语音产品及人工服务提升人机耦合服务效能。私有化转写翻译服务为客户提供私有化转写翻译服务。332022.12 iResearch I火山引擎服务于字节系产品,短视频字幕生成用户生态体量大火山引擎的语音识别能力基于深度学习技术,可将音频中的语音转成文字,用于识别多种音频编码格式、多种场景和不同长短的语音,广泛应用于音视频字幕生成、会议访谈转写、呼叫中心录音质检、课堂内容分析等场景。其智能字幕生成服务可用于辅助视频字幕创作和外挂字幕生成。产品支持多个语种的语音识别、歌词识别和字幕打轴,可结合语音停顿和自然语言的语义信息,全自动判断说话或唱歌,输出流畅自然的分句结果,适配视频剪辑、网课、视频会议等多种场景的智能字幕生成。有效提高视频内容生产者的积极性,降低视频内容处理成本。来源:艾瑞根据公开资料研究绘制。服务稳定准确率支持语种丰富企业级稳定服务保障,专有集群,大流量并发,高效灵活,可快速返回识别结果采用端到端语音识别框架,与抖音、飞书、剪映、西瓜视频等业务深度合作,具备实际业务场景打磨的丰富经验,确保准确率广泛应用于泛娱乐、办公、教育、客服场景,支持了汽车、智能金融、银行、保险、证券、运营商、物流、房地产等众多垂直领域多语种识别,支持中英日韩等多国语言及地区方言的识别多领域覆盖火山引擎语音转写服务特点与主要客户342022.12 iResearch 
I灵云听语灵云平台推出的以语音转文字为核心的云服务平台灵云听语是由捷通华声开发的一款专注语音识别转写的智能化应用。由灵云听语网页版和灵云听语App版组成,可分享相同账号,数据联通。网页版能够将音频转写结果以普通文本或字幕格式导出,支持在线编辑;App版则支持手机实时录音边说边转和导入音频文件转写识别。灵云听语支持多种音频格式,使用场景丰富,支持中文、英文、方言识别转写。中文转写覆盖13种专业领域,广泛用于办公会议、录音整理、访谈演讲、课程学习、记者采访、视频字幕制作等场景。来源:艾瑞根据公开资料研究绘制。转写服务覆盖13种专业领域通用聊天电话客服教育学习金融财经政党会议恋爱心理哲学思想广播电台企业办公旅游景点网课教学医疗健康国学历史实时转写响应速度快至500毫秒;非实时转写1小时音频文件只需5-10分钟语音识别速度快超大容量多种音频格式多语种mp3/wav/m4a/amr/mp4/flv/mov/avi格式支持中、英、方言识别和中、英、数字混合输入单条大小不超过5G,时长小于3小时灵云听语语音转写功能介绍352022.12 iResearch I录音转文字助手支持手机端和网页端服务,主要服务于C端用户录音转文字助手是由上海动起信息科技有限公司开发,可应用于安卓、苹果手机、iPad、网页端通用的一款将语音转文字、录音转文字、音频文件转文字并翻译记录的软件,适用于会议,采访,讲座,课堂,出国旅游,英语学习等各种场合。该应用依托迅捷语音的核心语音识别技术,提供视频转文字、图片转文字、合成主播等文字转语音应用,主要面向各行业C端用户。来源:艾瑞根据公开资料研究绘制。录音转文字助手业务布局与转写专业领域录音转文字翻译文字转语音快速转换文字,方便进行拷贝和编辑等后续的工作。适用于转写会议记录、电影对白、新闻媒体、情感写作等多个情景,提高办公效率,专注生产力的提升。亦提供人工精转服务提供简体中文、英文、阿拉伯语、德语、法语、葡萄牙语、西班牙语、意大利语、韩语的互译服务,支持中英文实时对话翻译高辨识度的语音合成功能,模拟真人发声,让文字信息变得绘“声”绘色。如广告叫卖、专题宣传、课件培训、方言配音、英语配音等。可以自定义主播参数的设置,如音量、语速、语调,来调节达到更适合使用场景的发音转写专业领域通用聊天会议办公教育培训情感写作新闻媒体IT科技36智能语音转写行业发展背景篇1智能语音转写行业市场分析篇2智能语音转写行业典型企业案例3智能语音转写行业发展趋势篇4372022.12 iResearch I技术趋势来源:艾瑞研究院根据公开资料与专家访谈自主研究绘制。应用价值提升仍受技术掣肘,转写场景有望进一步泛化智能语音转写的技术难点方言语种环境噪音多人声道如何提升语音识别鲁棒性?收集大量真实环境的语音数据进行带噪训练,需付出大量精力成本,且由于真实环境复杂多变,难以覆盖所有应用场景。采用单通道、麦克风阵列、机器学习模型、深度卷积模型自适应等语音增强方法,尽可能减弱背景噪声影响。当下语音识别技术的精度和速度仍取决于实际应用环境,在常见语种、标准口音、安静环境下的语音识别情况已达到了可规模化应用状态。但现实应用场景随机性极高,面对“混合语种”“嘈杂环境”下的“多人”“交互”“重叠”等多重因素交织的复杂语音场景,语音技术尚未能很好地处理这些问题。如今,语音转写应用多限制在办公会议、视频直播等部分较为理想环境下的固定场景,下一代语音识别技术的突破创新有望实现转写场景泛化升级,进一步抬升语音技术的应用价值与潜力空间。近场环境远场环境达到高识别准确率,甚至超过人类水平无噪音识别准确率略有降低,可规模化应用轻微噪音识别准确率将大幅下降在传播过程中,声波能量随传播距离呈指数衰减,语音信号受到噪声和混响的干扰更加严重鸡尾酒会问题:周围多人同时说话时,如何识别每个人的说话内容?众多汉语方言识别除中英应用广泛外的小众语种识别多语种混合识别(例:中英粤)如何区别不同说话人的语音转写内容?硬件层面:多麦板卡,基于硬件实现说话人分离目的算法层面:传统聚类算法,在说话人数量少,且无重叠语音等简单场景下,能够取得较好的效果;引入声纹识别,需提前录入说话人声纹达到说话人分离效果,限制应用场景;应用端到端语音分离模型,分离不同角色语音信号,将角色标签的指派问题,转化为目标说话人的语音检测问题,基于角色特性不断优化模型。如何解决方言及小语种的识别覆盖范围?尽可能收集方言及小语种的数据集语料进行语言模型训练解决低资源问题,通过少量数据资源解决方言,小语种识别问题如何解决多语种识别问题?通用建模:将不同语种的建模单元映射成同一套建模单元体系多语种混合模型:不同语种共享一个隐层神经网络,各自有独立的一个输出分类层382022.12 iResearch 
I场景价值基于产品生态圈,多维度延伸转写技术的内容价值链从产品生态圈来看,智能语音转写既可以作为单独功能产品出现,也可将转写模块嵌入到各个产品及应用领域中,将语音内容沉淀为文字资产,与更多应用形成内容联动,进一步拓展转写服务的技术优势与场景价值,打造连接转写应用生态的良性循环;此外,如今智能转写应用领域大多仅服务于从语音到文字转写内容的实现,而从内容价值链来看,未来转写应用可结合自然语言理解、机器学习、知识图谱等AI技术,拓展转写产品的场景边界,深入挖掘转写内容价值,在沉淀文字内容基础上,自主生成优化策略,以更高阶、智能的辅助替代角色,为客户提供问题预警、策略总结、决策分析等功能服务。目前可代表的典型场景为客服内容质检,但未来转写内容的分析挖掘在销售对话、办公内容洞察、视频内容分析、主播话术策略等领域有更加广阔的商业化前景。来源:艾瑞研究院自主研究绘制。智能语音转写产品发展方向 办公场景 音视频场景 交流场景 内容分析 将转写功能嵌入更多办公产品应用,形成内容联动及智能提取,提升办公效率 结合NLP及知识图谱技术进行转写内容的信息挖掘及深入分析 从web端、PC端、APP端提升转写功能可触达性,优化语种、方言的技术能力 赋能更多第三方音视频产品,开放转写功能模块,构建音视频产品AI应用生态 顺应会展两线融合趋势,提供线上会展字幕及翻译功能 泛化交流场景受众,赋能更多交流工具,打破方言、语种的语言壁垒 将语音转写功能开放给更多产品模块,将语音转为文字资产保留 开发文字资产价值,对转写内容进行深入分析,为公司提供高价值的决策依据392022.12 iResearch I厂商策略构建自身产品生态,加强外部场景合作顺应智能语音转写市场的需求释放,各家参与厂商将持续开展差异化竞争策略,在转写市场找到适合自身情况的角色定位,共同推进转写技术的应用渗透与市场发展。早期,智能语音厂商选择率先构建硬件生态,以硬件产品“创造”更多转写应用场景,快速获得C端流量入口与品牌认知,随后不断加强软件服务及生态能力。未来,智能语音厂商将在保证自身技术创新力与先进性的基础上,集中发力内部软硬件生态的合力构建;云服务厂商将持续保持对内嵌入转写功能、对外开放转写能力的双边策略,发挥自身平台优势,更多以提供底层能力服务的赋能者活跃市场;专业转写及翻译厂商将继续聚焦垂类场景,以转写及翻译能力为核心产品,以客户需求为导向,丰富软件产品的功能模块,加强构建更完善、更具业务理解的软件生态。来源:艾瑞研究院自主研究绘制。智能语音转写厂商策略构建硬件生态构建软件生态内部策略内部策略外部策略开展外部合作通过硬件产品开发创造更多转写服务的应用场景硬件产品一般选择与外部方合作,但转写厂商若具备硬件设计能力,可优化硬件中的拾音模块,提升转写识别准确率以转写能力为核心产品,开发对应软件产品将转写能力嵌入到现有软件产品中,将转写功能成为产品模块中的一项,优化用户在办公领域、音视频领域的使用体验。以API或SDK的接口形式将语音转写能力开放出去,为生态合作伙伴提供语音能力的集成化服务,无需自身投入大量精力实现以转写功能为核心的产品化。“厂商核心关注点即为转写服务的依托场景,如何通过内外部策略提升转写技术的商业价值”40行业咨询投资研究市场进入竞争策略IPO行业顾问募投商业尽职调查投后战略咨询为企业提供市场进入机会扫描,可行性分析及路径规划为企业提供竞争策略制定,帮助企业构建长期竞争壁垒为企业提供上市招股书编撰及相关工作流程中的行业顾问服务为企业提供融资、上市中的募投报告撰写及咨询服务为投资机构提供拟投标的所在行业的基本面研究、标的项目的机会收益风险等方面的深度调查为投资机构提供投后项目的跟踪评估,包括盈利能力、风险情况、行业竞对表现、未来战略等方向。协助投资机构为投后项目公司的长期经营增长提供咨询服务艾瑞新经济产业研究解决方案41艾瑞咨询是中国新经济与产业数字化洞察研究咨询服务领域的领导品牌,为客户提供专业的行业分析、数据洞察、市场研究、战略咨询及数字化解决方案,助力客户提升认知水平、盈利能力和综合竞争力。自2002年成立至今,累计发布超过3000份行业研究报告,在互联网、新经济领域的研究覆盖能力处于行业领先水平。如今,艾瑞咨询一直致力于通过科技与数据手段,并结合外部数据、客户反馈数据、内部运营数据等全域数据的收集与分析,提升客户的商业决策效率。并通过系统的数字产业、产业数据化研究及全面的供应商选择,帮助客户制定数字化战略以及落地数字化解决方案,提升客户运营效率。未来,艾瑞咨询将持续深耕商业决策服务领域,致力于成为解决商业决策问题的顶级服务机构。400-026-联系我们 Contact Us企 业 微 信微 信 公 众 
号关于艾瑞42法律声明版权声明本报告为艾瑞咨询制作,其版权归属艾瑞咨询,没有经过艾瑞咨询的书面许可,任何组织和个人不得以任何形式复制、传播或输出中华人民共和国境外。任何未经授权使用本报告的相关商业行为都将违反中华人民共和国著作权法和其他法律法规以及有关国际公约的规定。免责条款本报告中行业数据及相关市场预测主要为公司研究员采用桌面研究、行业访谈、市场调查及其他研究方法,部分文字和数据采集于公开信息,并且结合艾瑞监测产品数据,通过艾瑞统计预测模型估算获得;企业数据主要为访谈获得,艾瑞咨询对该等信息的准确性、完整性或可靠性作尽最大努力的追求,但不作任何保证。在任何情况下,本报告中的信息或所表述的观点均不构成任何建议。本报告中发布的调研数据采用样本调研方法,其数据结果受到样本的影响。由于调研方法及样本的限制,调查资料收集范围的限制,该数据仅代表调研时间和人群的基本状况,仅服务于当前的调研目的,为市场和客户提供基本参考。受研究方法和数据获取资源的限制,本报告只提供给用户作为市场参考资料,本公司对该报告的数据和观点不承担法律责任。合作说明该报告由讯飞听见和艾瑞共同发起,旨在体现行业发展状况,供各界参考。

    Views: 101 · Published: 2022-12-15 · 43 pages · Rating: 5 stars
  • MobiDev:如何有效搭建语音识别系统(2022)(英文版)(15页).pdf

    Table of ContentsVOICE RECOGNITION VS SPEECH RECOGNITIONHow do speech recognition applications work?WHICH TYPE OF AI IS USED IN SPEECH RECOGNITION?WHAT IS IMPORTANT FOR SPEECH RECOGNITION TECHNOLOGY?Automatic Speech Recognition process and componentsOur 4 recommendations for improving quality of ASR1.PAY ATTENTION TO THE SAMPLE RATE2.NORMALIZE RECORDING VOLUME3.IMPROVE RECOGNITION OF SHORT WORDS4.USE NOISE SUPPRESSION METHODS ONLY WHEN NEEDEDGet the enhanced ASR system1Modern voice applications use AI algorithms to recognize different sounds,including human voice and speech.In technical terms,most of the voice appsperform either voice recognition or speech recognition.And while there is no bigdifference between the architecture and AI models that perform voice/speechrecognition,they actually relate to different business tasks.So first of all,let uselaborate on the difference between them.VOICE RECOGNITION VS SPEECH RECOGNITIONVoice recognition is the ability to single out specific voices from other sounds,and identify the owners tone to implement security features like voicebiometrics.Speech recognition is mostly responsible for extracting meaningful informationfrom the audio,recognizing the words said,and the context they are placed in.With this we can create systems like chatbots and virtual assistants forautomated communication and precise understanding of voice commands.Both terms can often be used interchangeably,because there is not muchtechnical difference between the algorithms that perform these functions.Although,depending on what you need,the pipeline for voice or speechrecognition may be different in terms of processing steps.If you are interestedin voice recognition for security systems specifically,read our article on AI voicebiometrics:In this post,well focus on the general approach for speech recognitionapplications,and elaborate on some of the architectural principles we can applyto cover all of the possible functional requirements.2How do 
speech recognition applicationswork?Speech recognition covers the large sphere of business applications rangingfrom voice-driven user interfaces to virtual assistants like Alexa or Siri.Anyspeech recognition solution is based on the Automatic Speech Recognition(ASR)technology that extracts words and grammatical constructions from the audio,to process it and provide some type of system response.WHICH TYPE OF AI IS USED IN SPEECH RECOGNITION?Speech recognition models can react to speech directly as an activation signalfor any type of action.But since were speaking about speech recognition,it isimportant to note that AI doesnt extract meaningful information right from theaudio,because there are many odd sounds in it.This is where speech-to-textconversion is done as an obligatory component to further apply NaturalLanguage Processing or NLP.So the top-level scope of a speech recognition application can be represented asfollows:the users speech provides input to the AI algorithm,which helps to findthe appropriate answer for the user.High-level representation of an automatic speech recognition applicationHowever,it is important to note that the model that converts speech to text forfurther processing is the most obvious component of the entire AI pipeline.Besides the conversion model,there will be numerous components that ensureproper system performance.So approaching the speech recognition system development,first you mustdecide on the scope of the desired application:What will the application do?3Who will be the end users?What environmental conditions will it be used in?What are the features of the domain area?How will it scale in the future?WHAT IS IMPORTANT FOR SPEECH RECOGNITION TECHNOLOGY?When starting speech recognition system development,there are a number ofbasic audio properties we need to consider from the start:1.Audio file format(mp3,wav,flac etc.)2.Number of channels(stereo or mono)3.Sample rate value(8kHz,16kHz,etc.)4.Bitrate(32 kbit/s,128 
kbit/s,etc.)5.Duration of the audio clips.The most important ones are audio file format and sample rate,so lets speak ofthem in detail.Input devices record audio in different file formats,and mostoften audio is saved in loosy mp3,but there can also be lossless formats likeWAV or Flac.Whenever we record a sound wave,we basically digitize the soundby sampling it in discrete intervals.This is whats called a sample rate,whereeach sample is an amplitude of a waveform in a particular duration of time.Audio signal representation4Some models are tolerant to format changes and sample rate variety,whileothers can intake only a fixed number of formats.In order to minimize this kindof inconsistency,we can use various built-in methods for working with audio ineach programming language.For example,if we are talking about the Pythonlanguage,then various operations such as reading,transforming,and recordingaudio can be performed using the libraries like Librosa,scipy.io.wavfile andothers.Once we get the specifics of audio processing,this will bring us to a more solidunderstanding of what data well need,and how much effort it will take toprocess it.At this stage,consultancy services from a data science teamexperienced in ASR and NLP is highly recommended.Since gathering wrong dataand or setting unrealistic objectives are the biggest risks in the beginning.Automatic Speech Recognition process andcomponentsAutomatic speech recognition,speech-to-text,and NLP are some of the mostobvious modules in the whole voice-based pipeline.But they cover a very basicrange of requirements.So now lets look at the common requirements to speechrecognition,to understand what else we might include in our pipeline:The application has to work in background mode,so it has to separate theusers speech from other sounds.For this feature,well need voice activitydetection methods,which will transfer only those frames that contain thetarget voice.The application is meant to be used in crowded places,which means 
therewill be other voices and surrounding noise.Background noise suppressionmodels are preferable here,especially neural networks which can removeboth low-frequency noise,and high frequency loud sounds like humanvoices.In cases where there will be several people talking,like in the case of a callcenter,we also want to apply speaker diarization methods to divide theinput voice stream into several speakers,finding the required one.The application must display the result of voice recognition to the user.Then it should take into account that speech2text(ASR)models may5return text without punctuation marks,or with grammatical mistakes.Inthis case,it is advisable to apply spelling correction models,which willminimize the likelihood that the user will see a solid text in front of them.The application will be used in a domain area,where professional termsand abbreviations are used.In such cases,there is a risk that speech2textmodels will not be able to correctly cope with this task and then trainingof a custom speech2text model will be required.In this way,we can derive the following pipeline design which will includemultiple modules just to fetch the correct data and process it.Automatic Speech Recognition(ASR)pipelineThroughout the AI pipeline,there are blocks that are used by default:ASR andNLP methods(for example,intent classification models).Essentially,the AIalgorithm takes sound as an input,converts it to speech using ASR models,andchooses a response for the user using a pre-trained NLP model.However,for a6qualitative result,such stages as pre-processing and post-processing arenecessary.Now well move to advanced architecture.Our 4 recommendations for improvingquality of ASRTo optimize the planning of the development and mitigate the risks before youget into trouble,it is better to know of the existing problems within the standardapproaches in advance.MobiDev ran an explicit test of the standard pipeline,soin this section will share some of the insights found that 
need to be considered.

1. PAY ATTENTION TO THE SAMPLE RATE

As we've mentioned before, audio has characteristics such as sample rate, number of channels, etc. These can significantly affect the result of voice recognition and overall operation of the ASR model. To get the best possible results, we should consider that most pre-trained models were trained on datasets with a 16 kHz sample rate and only one channel, or in other words, mono audio.

This brings with it some constraints on what data we can take for processing, and adds requirements to the data preparation stages.

2. NORMALIZE RECORDING VOLUME

Obviously, ASR methods are sensitive to audio containing a lot of extraneous noise, and suffer when trying to recognize atypical accents. But what's more important, speech recognition results will strongly depend on the sound volume. Sound recordings can often be inconsistent in volume due to the distance from the microphone, noise suppression effects, and natural volume fluctuations in speech. To avoid such inaccuracies, we can use the Python library Pyloudnorm, which helps to determine the sound volume range and amplify the sound without distortion. This method is very similar to audio compression but introduces fewer artifacts, improving the overall quality of the model's predictions.

Figure: Nvidia QuartzNet 15x5 speech recognition results with and without volume normalization

Here you can see an example of voice recognition without volume normalization, and also with it. In the first case, the model struggled to recognize a simple word, but after volume was restored, the results improved.

3. IMPROVE RECOGNITION OF SHORT WORDS

The majority of ASR models were trained on datasets that contain texts with proper semantic relations between each sentence. This brings us to another problem with recognizing short phrases taken out of context. Below is a comparison of the performance of the ASR model on short words taken out of context and on a full sentence:

Figure: The result of recognizing short words in and out
of context

To overcome this problem, it is worth considering preprocessing methods that let the model understand more accurately in which particular area a person wants to receive information.

Additionally, ASR models can generate non-existent words and other specific mistakes during speech-to-text conversion. Spell correction methods may simply fail in the best cases, or choose to correct the word to one that is close to the right choice, or even change it to a completely wrong one. This problem also applies to very short words taken out of context, and it should be foreseen in advance.

4. USE NOISE SUPPRESSION METHODS ONLY WHEN NEEDED

Background noise suppression methods can greatly help to separate a user's speech from the surrounding sounds. However, once loud noise is present, noise suppression can lead to another problem, such as incorrect operation of the ASR model.

Human speech tends to change in volume depending on the part of the sentence. For example, when we speak we naturally lower our voice at the end of the sentence, which leads to the voice blending with other sounds and being drowned out by the noise suppression. This results in the ASR model not being able to recognize a part of the message. Below you can see an example of noise suppression affecting only a part of a user's speech.

Figure: Noise suppression effect on the speech recognition

It is also worth considering that applying Background Noise Suppression models distorts the original voice, which adversely affects the operation of the ASR model. Therefore, you should not apply Background Noise Suppression without a specific need for it.

Get the enhanced ASR system

Based on the points above, the initial pipeline can bring more trouble than actual performance benefits. This is because some of the components that seem logical and obligatory may interrupt the work of other essential components. In other cases, there is a strict need to add layers of preprocessing before the actual
AI model can interact with data. We can therefore come up with the following enhanced ASR system architecture:

Figure: Enhanced automatic speech recognition system pipeline

Based on the above points, the noise suppression and spelling correction modules were removed. Instead, to solve the problem of removing noise and getting rid of errors in the recognized text, the ASR model has to be fine-tuned on real data. This data will fully reflect the actual environmental conditions and features of the domain area.

As you can see, speech recognition application development has many pitfalls, and addressing them requires strong software engineering and machine learning expertise. At MobiDev, AI engineers have built solid practical experience through projects and R&D on voice applications. If you have a project idea that involves speech recognition or any type of related technology, contact us to discuss the details.
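
The first two recommendations (feed the model 16 kHz mono audio, and even out the recording volume) can be illustrated with a short sketch. This is not the pipeline MobiDev used. It relies only on the Python standard library so that it stays self-contained, and it makes several simplifying assumptions: the function names (`to_mono`, `resample_linear`, `normalize_rms`, `prepare_for_asr`) and the -20 dBFS target are hypothetical, linear interpolation stands in for a proper resampler from a library such as Librosa, and plain RMS gain stands in for the perceptual loudness measurement that Pyloudnorm provides.

```python
import math

TARGET_SAMPLE_RATE = 16_000  # Hz; the rate the article says most pre-trained models expect


def to_mono(frames):
    """Average the channels of every frame, e.g. [(left, right), ...] -> [mono, ...]."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples, src_rate, dst_rate=TARGET_SAMPLE_RATE):
    """Naive linear-interpolation resampling.

    Assumption: acceptable for illustration only. There is no anti-aliasing
    filter, so a real pipeline should use a dedicated resampler.
    """
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (samples are floats in [-1, 1])."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def normalize_rms(samples, target_dbfs=-20.0):
    """Apply one constant gain so the RMS level reaches target_dbfs.

    The gain is reduced if it would push the peak above full scale, so the
    signal is never clipped.
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)                   # silence: nothing to scale
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    peak = max(abs(s) for s in samples) * gain
    if peak > 1.0:
        gain /= peak
    return [s * gain for s in samples]


def prepare_for_asr(frames, src_rate):
    """Downmix to mono, resample to 16 kHz, then normalize the volume."""
    mono = to_mono(frames)
    resampled = resample_linear(mono, src_rate)
    return normalize_rms(resampled)
```

Calling `prepare_for_asr(frames, 48_000)` on a list of `(left, right)` float frames returns a mono list at 16 kHz with an RMS level of about -20 dBFS. The steps are ordered so that normalization sees exactly the signal the model will receive.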

    Views: 17 · Published: 2022-11-15 · 15 pages · Rating: 5 stars
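  • World Economic Forum (WEF): A Policy Framework for Responsible Limits on Facial Recognition, Use Case: Law Enforcement Investigations, 2022 (English edition) (43 pages).pdf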

    A Policy Framework for Responsible Limits on Facial Recognition
Use Case: Law Enforcement Investigations

INSIGHT REPORT
REVISED NOVEMBER 2022

Contents

Foreword
Introduction
Methodology
1 Law enforcement investigations: use cases and definitions
2 Principles
3 Self-assessment questionnaire
Conclusion
Glossary
Contributors
Endnotes

Images: Getty Images, Unsplash

© 2022 World Economic Forum, UNICRI, INTERPOL and Netherlands Police. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, or by any information storage and retrieval system.

Disclaimer: This document is published by the World Economic Forum as a contribution to a project, insight area or interaction. The findings, interpretations and conclusions expressed herein are a result of a collaborative process facilitated and endorsed by the World Economic Forum but whose results do not necessarily represent the views of the World Economic Forum, nor the entirety of its Members, Partners or other stakeholders.

Foreword

Remote biometric technologies, in particular facial recognition, have gained a lot of traction in the security sector in recent years. The accuracy of these technologies has significantly increased with advancements in deep learning algorithms, growing access to huge volumes of training data and the pressure to reduce bias to negligible values. The advent of this technology comes at a time when law enforcement agencies are increasingly expected to resolve ever more complex and often transnational crimes and conduct their investigations expeditiously, often with limited resources. In a field in which underperformance can be a matter of life or death, tools
such as facial recognition technology can greatly benefit law enforcement investigations. But, improperly implemented or implemented without due consideration for its ramifications, facial recognition technology (FRT) could result in major abuses of human rights and cause harm to citizens, particularly those in underserved communities.

Undoubtedly, the rapid adoption of FRT has raised multiple concerns, mainly related to its potential to undermine freedoms and the right to privacy. In parallel with this, there has been a growing emphasis on putting policies in place to address and mitigate these risks.

With the creation of this paper, the World Economic Forum, the International Criminal Police Organization (INTERPOL), the United Nations Interregional Crime and Justice Research Institute (UNICRI) and the Netherlands Police have built a global alliance to tackle this challenge and bring the issue of responsible use of FRT in law enforcement investigations to the international agenda. We have also engaged with a community of experts composed of governments, civil society and academia to collect their insights through a consultative process and have piloted our proposed framework with law enforcement agencies to ensure that what we propose can truly work in an operational law enforcement context. And it does.

This insight report presents a set of proposed principles for the use of facial recognition in law enforcement investigations along with a self-assessment questionnaire intended to support law enforcement agencies to design policies surrounding the use of FRT and to review existing policies in line with the proposed principles.

This is only the beginning of the conversation on law enforcement's use of FRT, but we are confident that this unique proposed approach can be an important contribution to the law enforcement community and help to inform public debate all across the globe. We encourage law enforcement agencies and policy-makers at the national level to reflect on
this paper, to participate in a dialogue on the basis of it and to review or adopt legislation that supports the responsible use of facial recognition technology.

Irakli Beridze, Head of the Centre for Artificial Intelligence and Robotics, UNICRI
Kay Firth-Butterfield, Head of Artificial Intelligence and Machine Learning; Member of the Executive Committee, World Economic Forum
Marjolein Smit-Arnold Bik, Head of the Special Operations Division, Netherlands Police
Cyril Gout, Director of Operational Support and Analysis, INTERPOL

Introduction

Over the past decade, progress in artificial intelligence (AI) and sensors has fuelled the development of facial recognition technology (FRT): software capable of matching a human face from a digital image or a video frame against a database of facial images. This has led to its rapid adoption in various industries, including law enforcement, transportation, healthcare and banking.

The development of FRT presents considerable opportunities for socially beneficial uses. For instance, it can find application in face-unlock mechanisms in mobile devices, in granting access to concerts and sporting events, and in attendance-tracking for employees and students. But it also creates unique challenges. To fully grasp these challenges and the trade-offs they may entail and to build appropriate governance processes, it is necessary to approach FRT deployment by examining specific applications. Indeed, passing through an airport border control with face identification, using face-based advertising in retail or employing facial recognition solutions for law enforcement investigations involves very different benefits and risks.

To ensure the trustworthy and safe deployment of this technology across domains, the World Economic Forum has spearheaded a global and
multistakeholder policy initiative to design robust governance frameworks.

The Forum launched the first workstream in April 2019, focusing on flow management applications:[1] replacing tickets with facial recognition to access physical premises or public transport, such as train platforms or airports. This workstream was concluded in December 2020 with the release of a tested assessment questionnaire by Tokyo-Narita Airport, an audit framework and a certification scheme co-designed with AFNOR Certification (Association Française de Normalisation).[2] In November 2020, the second workstream was launched, focused on the law enforcement context: supporting the identification of a person by comparing a probe image to one or multiple reference databases to advance a police investigation. While law enforcement has been using biometric data, such as fingerprints or DNA, to conduct investigations, FRT is a new opportunity and challenge for law enforcement.

In terms of challenges, use by law enforcement raises multiple public concerns, primarily because of the potentially devastating effects of system errors or misuses in this domain. A study conducted in 2019 by the National Institute of Standards and Technology (NIST) showed that, although some facial recognition algorithms had “undetectable” differences in terms of accuracy across racial groups, others exhibit performance deficiencies based on demographic characteristics such as gender and race.[3] Law enforcement agencies should be aware of these potential performance deficiencies and implement appropriate governance processes to mitigate them. In doing so, they would limit the risk of false positives or false negatives and possible wrongful arrests of individuals based on outputs from an FRT system. Failure to build in such processes could have dramatic consequences. For example, in 2018 in the United States, an innocent African American man was arrested and held in custody as a result of being falsely identified as a suspect in a theft investigation in
which FRT was used.[4] In addition to hampering rights such as the presumption of innocence, and the right to a fair trial and due process, the use of FRT by law enforcement agencies can also undermine freedom of expression, freedom of assembly and association, and the right to privacy.[5]

These concerns have led to intensified policy activity globally. In the US alone, some local and state governments have banned the use of FRT by public agencies, including law enforcement. Major cities such as San Francisco,[6] Oakland[7] and Boston[8] have adopted such measures. At the state level, Alabama,[9] Colorado,[10] Maine,[11] Massachusetts,[12] Virginia[13] and Washington[14] have all introduced legislation to regulate its use. Finally, at the federal level, various bills,[15] including most recently the Facial Recognition Act of 2022, introduced in September 2022,[16] have been proposed to regulate FRT but none of them has been adopted to this date.

Furthermore, large US technology companies have also formulated positions on this topic. In the wake of a series of events in 2020 that increased distrust toward police agencies in the US and worldwide, including the Clearview AI controversy,[17] IBM announced that it will no longer offer, develop or research FRT, while Microsoft pledged to stop selling FRT to law enforcement agencies in the US until federal regulation was introduced.[18] In 2022, Microsoft went further, putting new limits and safeguards on all uses of FRT as part of a broader set of AI principles.[19] In 2021, Amazon Web Services (AWS) also extended its moratorium on police use of its platform Rekognition, which it originally imposed in 2020.[20]

In other jurisdictions, policy-makers are attempting to limit police use of FRT to very specific use cases
associated with robust accountability mechanisms to prevent potential errors that may lead to wrongful arrests. That is the direction proposed by the European Commission, which in 2021 released its draft of an Artificial Intelligence Act,[21] a comprehensive regulatory proposal that classifies AI applications under four distinct categories of risk subject to specific requirements.[22] This proposal includes provisions on remote biometric systems, which include FRT. It states that AI systems intended to be used for the “real-time” and “post” remote biometric identification of natural persons represent high-risk applications and would require an ex ante conformity assessment of tech providers before getting access to the European Union market and an ex post conformity assessment while their systems are in operation. Moreover, “real-time” remote biometric identification systems in publicly accessible spaces for the purpose of law enforcement are prohibited unless they serve very limited exceptions related to public safety (e.g. the prevention of imminent terrorist threats or a targeted search for missing persons). In order to enter into force, however, the European Commission's proposal will first need to be adopted by the European Union parliament and the Council of the European Union.

At the United Nations, a similar approach is emerging, with the Office of the High Commissioner for Human Rights (OHCHR) presenting a report[23] in 2021 to the Human Rights Council on the right to privacy in the digital age, in which it recommends banning AI applications that cannot be used in compliance with international human rights law. With specific respect to the use of FRT by law enforcement, national security, criminal justice and border management, the report stated that remote biometric recognition dramatically increases the ability of State authorities to systematically identify and track individuals in public spaces, undermining the ability of people to go about their lives unobserved and resulting in a direct negative effect on the exercise of the rights to freedom of expression, of peaceful assembly and of association, as well as freedom of movement. The report also reiterates calls for a moratorium on the use of remote biometric recognition in public spaces, at least until authorities can demonstrate that there are no significant issues with accuracy or discriminatory impacts, and that these AI systems comply with robust privacy and data protection standards.

Courts have also started to play an important role in shaping the policy agenda on FRT. In 2021, the São Paulo Court of Justice in Brazil blocked[24] the deployment of FRT in the public transport system. This was perceived as a major victory by civil rights organizations that oppose the increasing use of FRT by public agencies. In a similar case in the UK, while the Court of Appeal found that the deployment of automated FRT by the police did have a legal basis for use in common law, its use by the South Wales Police at certain events and public locations was unlawful because it did not sufficiently define who could be on a watch list and where it could be used.[25]

In some countries, governments have adopted a cautious approach. That has been the case in the Netherlands. In 2019, the Minister of Justice and Security addressed a letter to members of parliament informing them about the existing uses of FRT by law enforcement agencies and reaffirming his support for robust governance processes in relation to this sensitive technology.[26] Further, he argued that the existing legal framework and safeguards, both technical and organizational, are sufficiently robust to ensure the responsible use of FRT by law enforcement agencies. Nevertheless, he requested additional
privacy, ethical and human rights impact assessments before authorizing any further uses or pilots of FRT.

Despite these developments, most governments around the world continue to grapple with the challenges presented by FRT. The ambition of this work is thus to strengthen their efforts to overcome them, and support law- and policy-makers across the globe in designing an actionable governance framework that addresses the key policy considerations raised, such as the necessity of a specific purpose, the performance assessment of authorized solutions, the procurement processes for law enforcement agencies, the training of professionals and the maintenance of the chain of command for emergency situations.

To achieve this, the World Economic Forum, the International Criminal Police Organization (INTERPOL), the United Nations Interregional Crime and Justice Research Institute (UNICRI) and the Netherlands Police convened a multistakeholder community centred on co-designing a set of principles that outline what constitutes the responsible use of FRT for law enforcement investigations. These principles are accompanied by a self-assessment questionnaire to support law enforcement agencies to design policies surrounding the use of FRT and to review existing policies in line with the proposed principles.

In addition to providing practical guidance and support to law enforcement and policy-makers, this governance framework seeks to inform public debate on the use of FRT at the national, regional and international levels and provide an actionable framework to maximize the benefits of FRT while mitigating its risks.

While the policy framework proposed in this paper is not the only such policy guidance in this domain, it seeks to present a unique proposal built with an international perspective, incorporating a multistakeholder approach, including law enforcement, industry and civil society, in its development.

Methodology

For the past three years, the artificial intelligence/machine learning (AI/ML) platform of the World Economic Forum has been conducting a policy initiative on the governance of FRT. The objective of this initiative is to create an appropriate space for conversation to advance the drafting of policies related to the use of this biometric technology. The methodology, in essence, consists of a core community of partners and an extended global community of experts co-leading the development of a pilot project. This pilot-based approach to policy-making has been adopted as it is considered to have the potential to better inform and guide law enforcement users and policy-makers seeking to ensure the appropriate governance of FRT.

A multistakeholder approach based on a core community and a project community

The initiative brought the World Economic Forum together with INTERPOL and the Netherlands Police, both users of FRT, and UNICRI, a United Nations entity mandated to support United Nations Member States in formulating and implementing improved policies in the fields of crime prevention and criminal justice. With the objective of proposing a policy framework, this core community gathered virtually on a weekly basis between January 2021 and October 2022.

The core community additionally organized consultations with an extended group of stakeholders, the project community, to further benefit from broader expertise and insights. A total of 64 individuals participated in this project community, representing technology companies, governmental and international organizations, civil society and academia.

The first consultation with the project community was a workshop, organized in February 2021, which kicked off the project and sought to gain insights regarding the risks related to the use of FRT by law enforcement and the potential solutions to mitigate them. The second consultation was a request for comments on the draft of the principles for the responsible use of FRT for
law enforcement investigations. The project community was allocated a month to share any comments on the proposal. Following this, four expert interviews were organized to gather additional insights. In total, 10 organizations and experts from the project community shared comments on the draft, which the core community incorporated into a revised draft of the principles.

The whole project was conducted under the Chatham House Rule, whereby participants are free to use the information received, but neither the identity nor the affiliation of the speaker(s), nor that of any other participant, may be revealed.[27]

A policy framework composed of principles and a self-assessment questionnaire

This policy framework comprises two elements:

- The principles, and their corresponding actions, which aim to define what constitutes the responsible use of FRT in law enforcement investigations. This list of nine principles was drafted by the core community composed of INTERPOL, UNICRI, the Netherlands Police and the World Economic Forum.
- The self-assessment questionnaire, which complements the principles and is intended to support practitioners in the law enforcement community in effectively implementing these proposed principles. Law enforcement agencies already using FRT are encouraged to use the questionnaire to review their existing processes and assess the alignment of their approach with the proposed principles. The self-assessment questionnaire can also be used by agencies that do not currently have FRT in operation but which have the ambition to develop the capability. In this regard, it can be used as a guide to help them reflect upon the necessary steps to develop their capabilities responsibly and review their processes as they develop them.

Piloting to test and improve the policy framework

In October 2021, the first draft of the policy framework was publicly released, bringing the initial developmental phase of the project to a conclusion. The next phase of the project was launched in January 2022, focusing on piloting the policy framework. The pilot was intended to collect feedback from the pilot members in order to review and validate the utility and completeness of the principles and the self-assessment questionnaire, assessing it as a system and tool for law enforcement to ensure the trustworthy and safe deployment of FRT. Feedback from participating agencies on their overall compliance with the principles was not sought as it was outside the scope of the exercise.

To this end, a series of three pilot meetings were convened as part of the pilot and pilot agencies were allocated four months to complete the self-assessment questionnaire and provide feedback on the policy framework. A total of six law enforcement agencies from five countries participated in the project, namely, the:

- Brazilian Federal Police
- Central Directorate of the Judicial Police, France
- National Gendarmerie, France
- Netherlands Police
- New Zealand Police
- Swedish Police Authority

With the exception of the Brazilian Federal Police, each of the pilot agencies possesses operational FRT capabilities. The Brazilian Federal Police has implemented several FRT pilots, and operational capabilities are foreseen in the near future.

1 Law enforcement investigations: use cases and definitions

A description of how facial recognition technology is used in practice by law enforcement agencies.

FRT has many potential applications or use cases in law enforcement investigations, some of which will be touched upon in the sections that follow. These
descriptions are intended to provide a better understanding of how FRT is or can be used by law enforcement agencies and to help illustrate the challenges that the governance framework seeks to address. The different examples presented have been informed by the practices of the Netherlands Police and INTERPOL. It is important to note that specific practices may vary from jurisdiction to jurisdiction, and that the use cases described do not refer to any specific laws, policies, principles or recommendations that would limit or regulate their use and are intended solely for illustrative purposes.

Unlike fingerprints and DNA, faces change significantly over time and can even differ from one day to the next. For instance, ageing, cosmetics, plastic surgery, the effects of drug abuse or smoking, and the way the subject poses can all influence facial appearance. This is one reason why the result of using FRT is always considered an investigative lead, meaning that, at most, the subject proposed by the system remains a possible match and a potential candidate only, even after a manual face comparison review by face experts.

FRT can be used for what are referred to in practice as biometric “identification” and “verification”. Again, it should be emphasized that, notwithstanding this terminology, in the context of law enforcement the result of an FRT search remains an investigative lead and the system does not per se “identify” any individual. “Identification” (also referred to as “one to many”) consists of searching for the identity of a person, whereas the activity of “verification” (also referred to as “one to one”) consists of verifying someone's identity against an identity document (ID).[28]

In addition to the distinction between biometric identification and verification, a further distinction can be drawn between what is referred to as “real-time” or “post-event” facial recognition. So-called “real-time” facial recognition involves the use of live or near-live material, such as video feed, generated by a
camera (real-time passive capture) or footage captured by an officer using a mobile device (real-time active capture). The comparison and identification occur concurrently with the capturing of the biometric data. By contrast, with post-event facial recognition, the comparison and identification occur significantly after the biometric data has been collected.

Facial experts play a central role in the use of FRT systems and can be classified as facial examiners, reviewers or assessors. Facial assessors perform the least rigorous of facial comparison processes, carrying out only quick comparisons of image-to-image or image-to-person in screening and access control applications or field operations. Facial reviewers conduct comparisons of image(s)-to-image(s), generally resulting from the adjudication of a candidate list generated by FRT. Facial examiners are experts who perform an analysis of image(s)-to-image(s) using a rigorous morphological comparison and evaluation of images for the purpose of effecting a conclusion. In the case of the Netherlands Police and INTERPOL, for instance, the facial recognition search and comparison is performed by facial examiners who operate autonomously from the investigation teams; they do not have knowledge of the prosecution that requires them to run facial recognition analysis.[29]

BOX 1: The roles of the Netherlands Police and INTERPOL

The Netherlands Police and INTERPOL are entities with two distinct mandates. As a national law enforcement body, the Netherlands Police has the mandate to conduct investigations and is required to testify and report the outcome of its expertise before a judge in court. INTERPOL's mandate, on the other hand, is, inter alia, to ensure and promote the widest possible mutual assistance between all criminal police authorities within the limits of the laws existing in the different countries and in the spirit of the Universal Declaration of Human Rights. To do so, INTERPOL manages databases accessible to its 195 member countries. INTERPOL also provides recommendations on best practices, forensic expertise and other specialized expertise, produces analysis, delivers training activities and provides operational support to its member countries.

Figure: How facial recognition is used for law enforcement investigations (panels: biometric verification, with a probe image and a reference image; biometric identification, with a probe image and a reference database)

Reference databases are repositories of images to which law enforcement agencies have lawful access and against which a probe image is compared. In law enforcement investigations, it is common for the reference database used to be a database of known suspects and convicts, composed of mugshots lawfully collected and stored by the law enforcement agencies. People in such databases are still suspects or have usually been convicted of a crime.

Figure: Reference database of known criminals, suspects and missing persons. A reference database of known criminals, suspects and missing persons has been built over time by law enforcement. A probe image is compared against this reference database to check if the person is among known criminals, suspects and missing persons.

Figure: Reference database built specifically for an investigation. Face images from the investigation are collected to create a special reference database. An image of a known criminal, suspect or missing person can be searched against the special reference database.

Alternatively, a special database can be built specifically for an investigation. In this case, the public prosecutor would authorize the seizure of video footage from a crime scene. Such a database can then be compiled from multiple sources (CCTV, social media, electronic devices, etc.), and all of the faces detected on the footage will be stored within it. The probe image of, for instance, a possible suspect can then be searched against the special database to see if the suspect is present on the footage. At
the end of the investigation, the database will be removed from the operational system and stored for accountability purposes, and in the event that the file may need to be produced in court as evidence during a judicial procedure.

[Figure: Probe images of suspects or persons of interest. A probe image is collected from an image source. The probe image is compared against a reference database.]

To identify an unknown person of interest, investigators work with probe images and reference databases:

Probe images are facial photos of suspects or persons of interest that are part of the law enforcement investigation and are submitted to an FRT system to be compared to a database. Once a probe image is enrolled into an FRT system, a biometric template (a mathematical representation of the features or characteristics from the source image) is generated for subsequent processing by the system. To collect probe images, investigators (or digital/face experts) either already have an image of the suspect or they extract it from footage of videos/stills. In either case, they will seek to collect the best-quality image to ultimately improve the chance of identifying the person.

The following process for using FRT for law enforcement investigations is based on the practices followed by the Netherlands Police; other law enforcement agencies may follow slightly different processes, but the key principles will generally be the same:

Step 1: A (possible) crime is reported or suspected. An investigation team under the supervision of the public prosecutor is created and, if required by local legislation, requests warrants to collect images relevant to the crime, including images of the suspect(s). If suspects are detected on the images, the team will try to determine their identity. This can be done by human means, through recognition by people who know the suspects; for
instance, police officers or witnesses, or by requesting an FRT search.

Step 2: If an FRT search is requested by the investigation team, a facial examination team will run FRT software to compare the probe image against one or multiple databases. Before doing so, the facial examiners will first manually assess the quality of the probe image. If it is deemed suitable for an FRT search, they will enter the probe into the FRT system and allow the system to do the pre-search analysis, and may also provide some notable facial landmarks (centre of the eye socket, etc.) to the software. The examiners will then set the FRT software to a setting that is neither too narrow (to avoid false negatives, which could lead to missing the suspect) nor too wide (to avoid false positives, which would result in a list of candidates too large to be of use).

Step 3: After the search, the facial examiners analyse the list of candidate images proposed by the software. They will run this last operation manually, deploying their expertise to check if one of the images proposed by the system could be a possible match for the probe image. In order to avoid bias, the facial examiners should not be made aware of the background to the case. The outcome of this step will be a recorded determination of either “possible match” or “no recognition”, with a note of: 1) dissimilarities observed; 2) some similarities observed; 3) many similarities observed; or 4) some similarities and some dissimilarities observed, leading to an inconclusive determination.

Step 4: If the facial examiners confirm the conclusion of a “possible match”, the probe image and the image of the potential candidate from the reference database are handed to two facial experts for a blind peer review.30 During the blind peer review, the facial experts, independently of each other, perform a full analysis of the probe and the reference image to determine the similarity/dissimilarity of the two faces. The end result to be reported to the investigation team is the final conclusion
reached by consensus; in the event of a lack of consensus, the most conservative conclusion in terms of similarities observed will prevail. On the other hand, if the facial examiners in Step 3 reach a conclusion of “no recognition”, the probe image is handed to one other expert to run the entire search de novo in order to reduce the risk of false negatives. If the de novo search results in a “possible match”, a blind peer review by two other facial experts will additionally be carried out as described above. Following the communication of the final result, the investigation team will proceed to review the results of the search, seeking to corroborate or disregard the proposed candidates.

[Figure: The four-step process followed by the Netherlands Police when using facial recognition technology.
1. An investigation is launched and an investigation team gathers image evidence.
2. If facial recognition is required, the investigation team asks the facial experts to run a facial recognition technology search resulting in a list of candidates.
3. Facial examiners manually analyse the list of candidates provided by the system. If the experts reach a conclusion of “no recognition”, the probe image is handed to another expert to run the FRT search de novo and Step 3 is repeated.
4. Investigative lead: if the facial examiners reach a conclusion of “possible match”, a blind peer review is then conducted by two other experts and a positive outcome is reported to the investigation team if, and when, all three experts reach the same conclusion.]

The following is a collection of scenarios intended to illustrate how FRT can be used for law enforcement investigations:

Finding the identity of an ATM fraud criminal

Fraudulently
obtaining bank account data by usurping an individual's identity can enable an unauthorized person to access a bank account and withdraw cash from an ATM. Video footage from the ATM enables investigators to collect a facial image of the offender. The quality of the probe image with regard to FRT searches will vary, depending on, for example, the light exposure and whether the individual has concealed their face. If the quality of the image is adequate, the image can be compared against a database of known criminals using an FRT system. Facial examiners will then analyse and manually compare the probe image with each candidate image and assess if there is a possible match or not. In the event that the examiner reaches the conclusion of a possible match, a peer review will be carried out by a second facial examiner and, if the two agree on the conclusion, they will subsequently share the possible match with the investigators as an investigative lead.

Uncovering the identity of an assailant of police officers during a riot

During a riot, footage of a person attacking police officers may be collected by CCTV cameras. If an investigation is launched, an investigation team will seek to obtain the images captured by the cameras with the goal of identifying the assailant. With the help of the law enforcement agency's digital experts, the investigators will review the CCTV footage of the riot, looking for images of the alleged assailant. They will endeavour to collect the images with the best angle, lighting and exposure possible to optimize the image quality, thus increasing the chances of obtaining possible matches and identifying the assailant. If the images collected are of adequate quality, they can be compared by facial examiners against a database of known criminals using an FRT system to assess if there is a possible match or not. In the event that the examiner reaches the conclusion of a possible match, a peer review will be carried out by a second facial examiner and, if the
two agree on the conclusion, they will subsequently share the possible match with the investigators as an investigative lead.

[Figure: Step 1: Analysis of collected footage to capture the face of the suspect. Step 2: Probe image of the suspect is collected. Step 3: Comparison of the probe image against a reference database of known criminals and suspects.]

Looking for the identity of a museum thief

Following a break-in and the theft of items of art from a museum, a public prosecutor launches a criminal investigation. Relying on police intelligence, the investigation team already has a known suspect in mind who has operated in the past with a similar modus operandi, and accordingly the team wants to verify this intelligence by ascertaining if this individual was in the museum on the day of the theft and in the days before. To do so, the investigators will seek to collect images of all the faces of visitors and staff who appear in the museum security footage. These will be used to build an investigative special database. An FRT search will then be made against the special database using the probe image depicting the suspected thief that was collected as part of a previous investigation. A list of candidate images is displayed by the FRT system and then reviewed and analysed by a facial expert to establish whether a possible match is detected that could be used to confirm the possible connection of the individual with the break-in. In the event that the examiner reaches the conclusion of a possible match, a peer review will be carried out by a second facial examiner and, if the two agree on the conclusion, they will subsequently share the possible match with the investigators as an investigative
lead.

[Figure: Step 1: A probe image of the known suspect is collected from a previous investigation. Step 2: Video footage from the museum is collected to build a special database of faces of all recorded individuals. Step 3: The probe image is compared against this special database to check if the suspect appears in the collected museum footage.]

Using facial recognition to fight child abuse

National law enforcement agencies and INTERPOL use FRT to investigate cases of child abuse. To dismantle international child abuse networks, INTERPOL runs investigations in partnership with national law enforcement agencies. Dedicated task forces within INTERPOL and national police departments collect images and pieces of evidence to facilitate the resolution of investigations. Images and videos showing victims of child abuse are stored in dedicated databases with highly restricted access. These databases are very often developed using a range of tools and features to support the work of investigators and help them to analyse the images and find new leads.

FRT can be used to help identify the victims, by searching their facial images in a database containing the facial images of missing persons. Missing minors, however, are not necessarily recorded in these facial databases because the face undergoes many changes throughout childhood and adolescence. In most cases, law enforcement relies on other means to identify victims. FRT can also be used to check if the same child appears in various image sources (termed clustering) and to estimate the period during which the victim has been abused. The primary goal of all of these findings is to identify, locate and rescue the victim as soon as possible.

Facial images of perpetrators, when collected and seized, can be searched in national criminal databases and in the INTERPOL criminal database in order to identify, locate and detain them with a view to prosecution. It is crucial for investigators to collect as much evidence as possible to document and strengthen the prosecution
case, using all existing investigative tools, including FRT when relevant.

Using facial recognition to find missing persons

When there is serious evidence suggesting the need for international police cooperation in a missing persons case, national law enforcement agencies may ask INTERPOL to publish a Yellow Notice. A Yellow Notice is a request to law enforcement worldwide to help locate missing persons.31 This file usually includes facial images, as well as other biometric attributes, such as fingerprints and DNA, where they are available. Once the law enforcement agency of a member country requests a Yellow Notice to be published, an FRT search is performed by INTERPOL to check if the person was previously recorded in the facial recognition database; for example, by another country as a criminal. The Yellow Notice can be beneficial when a person is declared missing in a given country and found dead in another one. In this case, the Yellow Notice will help identify the deceased person.

Because databases of minors are generally not maintained, this use case differs for missing children. With some exceptions, such as the National Tracking System for Vulnerable and Missing Children in India, the only way to identify missing children using facial recognition is by consulting investigation databases of child abuse cases and comparing images.32

Identity checking at a border control

[Figure: Step 1: A Red Notice is created and facial images of the wanted person are recorded in INTERPOL's facial recognition reference database. Step 2: A probe image of a suspected wanted person is collected during a border control check and sent to INTERPOL for an FRT search. Step 3: The probe image is compared against the reference database to check if the traveller is the wanted person.]

Border officers use identity controls to, inter alia, detect and potentially detain fugitives and wanted persons who are the subject
of an INTERPOL Red Notice, a global police alert to locate and provisionally arrest a person pending extradition, surrender or similar legal action.33 Red Notices contain information about the individual that can be used to identify them. If there are facial images of the wanted person, these will be stored in INTERPOL's facial image reference database of criminals and missing persons, the INTERPOL Facial Recognition System (IFRS).

In the event that a national border guard controlling the identity of people crossing a border considers a traveller to be the possible subject of a Red Notice, the border guard may seek further verification of the individual's identity by taking their picture and fingerprints. In agreement with their national authorities, border officers may send the facial image to their INTERPOL National Central Bureau (NCB) and to INTERPOL's headquarters for an urgent search against wanted persons and criminals in the IFRS. Once received, INTERPOL facial examiners will run the search as soon as possible in the IFRS using the probe image provided, and a list of candidate images will be proposed by the system. Facial examiners will then analyse and manually compare the probe image with each candidate image and assess whether a potential candidate emerges. If this is the case, a peer review will be carried out by a second facial examiner and, if the two agree on the conclusion, they will subsequently inform the concerned INTERPOL NCB and border agents.

It is important to note here that, even if the collection of the probe image and the search are performed almost concurrently, in near real time, the imperative to act fast in these situations does not prevent the outcome undergoing expert verification, and accordingly the standard procedures remain unmodified.

BOX 2: The use of real-time facial recognition technology

The use of real-time FRT for identification undoubtedly represents the most sensitive use case. The imperative to act fast (for instance, to prevent a specific, substantial and imminent threat to the life or physical safety or a terrorist attack) can, exceptionally, necessitate using FRT systems without the outcome undergoing expert verification. In this case, the system would automatically propose potential candidates based on live CCTV footage from public areas of interest or images collected by a law enforcement officer to be acted upon by investigators. In the absence of expert verification, the concerns outlined above are greatly exacerbated. As a result, public awareness of real-time FRT is uniquely heightened.

Notwithstanding the validity of the concerns surrounding this particular use case, there is often an unfounded belief that real-time facial recognition is the primary application of the technology. In reality, however, the use of real-time FRT is more limited than is often perceived. To date, a wide range of law enforcement agencies have implemented limited real-time pilots, with only a few agencies opting to adopt the use case into operations. The post-event application of FRT remains, by and large, the leading use case.

In light of this, the guidance presented in this insight report is primarily based upon consideration of, and tailored to, the use of post-event FRT unless otherwise expressly indicated. That said, the guidance provided is equally applicable to both real-time and post-event uses of FRT. However, in the context of real-time FRT, additional safeguards and higher standards for the application of the proposed principles will need to be taken on board by law enforcement agencies seeking to employ this use case in order to address the extra concerns that it presents.

Actively looking for a terrorist in public spaces

Note: the following example is a potential use case and has not been activated by either INTERPOL or the Netherlands Police.

In the aftermath of a terrorist attack, where the terrorist remains at large, CCTV footage may
be obtained by law enforcement to collect a probe image of the fugitive terrorist. This probe image can then be distributed to all police patrols actively looking for the fugitive. In addition, the probe image can be compared in real time against live footage from CCTV cameras (or other image sources) located in the terrorist's assumed vicinity, being streamed to an FRT system. This real-time comparison may generate a potential lead that can be sent to police patrols, which can be deployed to the area to investigate.

[Figure: Step 1: Analysis of collected footage to collect a probe image of the terrorist. Step 2: Comparison of the probe image with other images collected from CCTV located in the terrorist's assumed vicinity. Step 3: The video comparison leads to a possible match. Step 4: Police patrols are deployed to the area to investigate the lead.]

2 Principles

A global- and multistakeholder-developed set of principles for the responsible use of facial recognition technology for law enforcement investigations.

Note: The principles that follow have been identified as the foundations for ensuring that law enforcement agencies use FRT responsibly. Each principle contains a series of actions for them to either implement or take into consideration at the relevant stages of their decision-making process regarding FRT. The principles are not presented in any specific order of importance; however, Principle 1 “Respect for human and fundamental rights” can, by its nature, be considered the overarching principle of this framework and viewed as the motivation underlying the design of each of the other principles.

It should be noted that these principles have been designed primarily with the post-event FRT use case in mind. As previously observed, however, they are equally applicable to the real-time use of FRT although
additional safeguards and higher standards for the application of the principles will be needed to cater for the nuances presented by real-time FRT.Furthermore,it should be noted that these principles focus on law enforcement investigations only.All other law enforcement activities related to passport,residence permit and ID card issuance/verification,etc.are not covered here and are outside of the scope of this policy framework.1.1 FRT should be used only as part of a lawful investigation,and always only as an investigative lead,to support the identification of criminals/fugitives,missing persons,persons of interest and victims.1.2 The rights provided for within the International Bill of Human Rights and other relevant human rights treaties and laws should always be respected,particularly the right to human dignity,the right to equality and non-discrimination,the right to privacy,the right to freedom of expression,association and peaceful assembly,the rights of the child and older persons,the rights of persons with disabilities,the rights of migrants,the rights of Indigenous people and minorities,and the rights of persons subjected to detention or imprisonment.The use of FRT by law enforcement for investigations should respect these rights and be necessary and proportionate to achieve legitimate policing aims.1.3 Any restrictions or limitations to human rights are permissible under international human rights law only if they are necessary and proportionate to achieving a legitimate policing aim and are not applied in an arbitrary manner.These restrictions must be established in law and should correspond to the least intrusive means of pursuing such an aim.1.4 Law enforcement agencies should be subject to effective oversight by bodies with enforcement powers in accordance with national laws or policies.Among other things,these or other bodies should have the specific task of hearing and following complaints from citizens and assessing the compliance of law 
enforcement activities with human and fundamental rights.1.5 Law enforcement agencies should consider setting up an independent ethical oversight committee or assigning the responsibility to periodically review law enforcement officers use of FRT to a pre-existing body,supporting them in achieving respect for human and fundamental rights.1.6 Individuals should have the right to an effective remedy before an independent and impartial tribunal set up by law against actions concerning the use of FRT.Respect for human and fundamental rights 1A Policy Framework for Responsible Limits on Facial Recognition Use Case:Law Enforcement Investigations202.1 The decision to use FRT should always be guided by the objective of striking a fair balance between allowing law enforcement agencies to deploy the latest technologies,which are demonstrated to be accurate and safe,to safeguard individuals and society against security threats,and the necessity to protect the human rights of individuals.2.2 Law enforcement agencies considering the use of FRT should always provide a documented and justified argument as to why FRT is the chosen option and why alternative options were not chosen.2.3 The use of FRT by law enforcement agencies,from the request to the use of the outcome of the search,should always be aimed at,and limited to,a single specific goal,necessarily related to investigative purposes.2.4 International,regional and national policies and/or laws should specify for which classes of crimes or investigations the use of FRT by law enforcement agencies is acceptable and/or lawful.2.5 Acknowledging the right to privacy and other human rights,the collection of images from public and publicly accessible spaces for FRT identification purposes should be done only for a determined list of use cases,in a limited area and for an established processing time period in accordance with relevant national laws or policies.2.6 As a consequence of the additional risks involved in the use of 
real-time FRT,an independent authority responsible for oversight of law enforcement operations(such as the independent ethical oversight committee described in Principle 1.5)should be in charge of authorizing applications for its use and,if there is not enough time,it should be authorized through the chain of command.In such cases,the chain of command should inform the independent authority as soon as possible and not later than 24 hours after authorizing the use,justifying its decision to use real-time FRT and explaining why it considered there was insufficient time to seek its authorization in advance.Law enforcement should use the results of any real-time FRT search only to verify an individuals identity and conduct additional verifications.All images captured during an operation involving the use of real-time FRT,both the original image and the biometric template,should be deleted from the system,according to the policies governing the storage of live images.2.7 FRT,and other face analysis technologies,should be used for no purpose other than biometric identification/recognition/verification.The use of FRT to infer ethnicity,gender,sex,age,emotion,opinion,health status,religion and sexual orientation,and the use of FRT for predictive analysis,should not be permitted.3.1 Lines of responsibility for the outcome of a given use of FRT should be well defined and transparent.A law enforcement agency should never issue analysis and conclusions from FRT without interpretation by an examiner and oversight by a manager with the right expertise(with the unique exception described in Principle 2.6).3.2 The use of FRT should always be conducted by an individual trained as described in Principle 8(with the exception of situations of emergency as presented in Principle 2.6).The skills of facial experts are critical and necessary to maintain the highest level of accuracy in the identification process.3.3 A peer review(blind verification or examination by a second expert)should 
systematically be performed before a result is communicated to the requesting investigation team.The result provided should be consensus-based or,in the event of a lack of consensus,the most conservative conclusion in terms of similarities observed should prevail.3.4 The law enforcement agency should verify that a mechanism exists whereby citizens can file a complaint with or seek redress for any harms before a competent body designated by national authorities.3.5 If an individual proposed by an FRT system as a potential candidate is subsequently taken into custody,brought in as a witness or assumes any other official role in a law enforcement process,that person should be informed that he/she was subject to a search using FRT,provided that this would not compromise the investigation.Necessary and proportional use Human oversight and accountability 23A Policy Framework for Responsible Limits on Facial Recognition Use Case:Law Enforcement Investigations214.1 Law enforcement agencies should require vendors to follow FRT standards,such as those set by the International Organization for Standardization(ISO)and the European Committee for Standardization(CEN),to evaluate the performance of their algorithms at the design and deployment stages.4.2 Law enforcement agencies should introduce a standardized procurement process in a transparent way,requiring vendors to comply with the above-mentioned standards and to submit their algorithms to large-scale independent audits/testing undertaken against appropriate test standards(lab tests and,if possible,field tests).After evaluating all candidates,agencies should select the provider who can demonstrate the best-performing algorithm.4.3 Due diligence with respect to system performance should be undertaken by reference to large-scale independent tests,such as those conducted by NIST in the US.These tests provide a scientifically robust,transparent baseline of performance.4.4 Independent lab tests to validate the performance of the 
FRT should be designed to model,as closely as practical,the real-world objectives and conditions(including data landscape,operators of the technology,timetables affecting decisions made using the technology,etc.)in which the FRT is applied in practice.4.5 Law enforcement agencies should notify the technology provider of relevant errors identified in order to have the system reviewed.4.6 To leverage accuracy gains,law enforcement agencies should expect to make,and establish procedures for,regular upgrades or replacement of the FRT.5.1 The risk of error and bias by machines and humans should be mitigated to the greatest extent possible.This should be done through an ex ante and ex post evaluation strategy:5.1.1 Ex ante evaluations:technology providers,and where it applies,technology integrators,should ensure biases and errors are mitigated to the greatest extent possible before the deployment of the system by law enforcement agencies.The level of performance across demographics and the design of the quality management system should be evaluated by an independent third party.This evaluation should be organized by the technology provider and the results made available to law enforcement agencies that procure FRT and to the public for review.Law enforcement agencies that procure FRT should require in their procurement criteria information about the specific metrics the provider uses to gauge bias and other relevant risks.Before deploying FRT systems,law enforcement agencies should set up pilot tests to ensure the system is operating as intended.5.1.2 Ex post evaluations:law enforcement agencies if necessary,with the support of competent authorities should deploy risk-mitigation processes to identify,monitor and mitigate the risks of error and biases throughout the entire life cycle of the system.A regularly programmed internal audit(that could include the use of the self-assessment questionnaire related to these principles)and,if possible,an independent third-party 
audit should be conducted to validate the robustness of these processes.The conclusions of these audits should be made publicly available.To continually improve the quality of the processes and the systems performance,law enforcement agencies,technology providers and technology integrators should establish an in-service support agreement throughout the entire life cycle of the system.Optimization of system performance Mitigation of error and bias45A Policy Framework for Responsible Limits on Facial Recognition Use Case:Law Enforcement Investigations226.1 Law enforcement agencies should ensure that their processing of probe images and reference databases are compliant with international,regional and national laws and/or policies,which should include storage criteria,purpose limitation,retention period,deletion rules,etc.6.2 The collection of probe images should be conducted on a legal basis and aimed at a specific purpose.6.3 The reference database(s)used for FRT investigations should always have a legal basis and be used under the authorization of competent authorities.Consequently,reference databases that include data collected without legal basis from the internet,electronic devices or other sources should not be used.6.4 Probe images should not be inserted into the reference database by default.Probe images of unidentified subjects may be stored in a database for further investigation;however,such images should be appropriately labelled(e.g.as an unidentified suspect or unidentified victim)and the reasons for their insertion into the database detailed.Differently labelled categories of image can be stored on the same database but should be logically separated so that facial experts can,with requisite authorizations,independently search the specific categories.Additional care should be afforded to ensure that,if the underlying status justifying the insertion of the probe image into the database(e.g.as an unidentified suspect)changes,the image is removed from the 
database.
6.5 Exporting images and biometric metadata to public cloud-based FRT that could potentially be outside the local jurisdiction should be prohibited.
6.6 Law enforcement agencies should maintain a strict and transparent chain of custody of all images (probe image sets and reference databases) used for FRT. The law enforcement agency should specify, and enforce, clear and transparent rules designating who does and does not have access to the images, and under what circumstances.
6.7 Law enforcement agencies should specify well-defined protocols for determining when, and on the basis of what criteria, images are to be deleted from a probe set or a reference database. The law enforcement agency should create, and adhere to, a well-defined and transparent protocol for the disposal of images that have been deleted from a probe set or reference database or are otherwise no longer needed; any such protocol should be designed to protect the privacy of any individuals appearing in the images identified for disposal.
6.8 For all solved cases or for cases where the investigation has been concluded, the biometric template of the probe image should be deleted from the FRT system and the original facial image stored for accountability purposes in line with existing national law and policies.

Principle 6: Legitimacy of probe images and reference databases

7.1 Law enforcement agencies should establish standards and thresholds of image quality for reference database images in order to mitigate the risk of errors. Reference database images that do not meet the defined standards and image-quality thresholds should not be used.
7.2 Law enforcement agencies should also establish best practices to evaluate image quality for probe images. Before any search using an FRT system, the facial examiner should conduct a manual assessment of the image to ascertain if the probe image is of a high-enough quality to
conduct a facial comparison. If the expert is unable to do so manually, the probe image should be rejected. Although a minimum number of pixels between the eyes is often recommended, care should be taken when using this as a threshold as it is often insufficient to confirm image quality.
7.3 Standards for probe images and reference database images should be identified by each law enforcement agency, taking into account the strength of the algorithm, the results of internal testing of the FRT system, the nature of the use case and any recommendations from the technology provider regarding its specific system. Standards, such as International Civil Aviation Organization (ICAO) photo standards, may serve as guidance for assessing image quality of reference database images. Guidance on best practices for probe images and additional recommendations for reference database images could also be provided by groups such as the Facial Identification Scientific Working Group (FISWG), the European Network of Forensic Science Institutes Digital Imaging Working Group (ENFSI-DIWG) and the INTERPOL Facial Experts Working Group (IFEWG).
7.4 Law enforcement examiners should be aware of the risk of image manipulation, such as morphing and deepfakes, when images come from uncontrolled sources and/or production modes. When suspected, these images should be rejected or processed with extreme precaution.
7.5 Forensic upgrading (e.g. contrast and brightness correction) should comply with existing published guidance or standards (such as by FISWG).
7.6 Tools for non-forensic upgrading (e.g. pose correction) should be used only during the FRT search phase. If non-forensic upgrading is carried out, the insertion or modification of facial features or geometry on an existing image should be performed with care in order to avoid distortion of the image.
7.7 In case of a possible match, and to reach a final conclusion, only forensic upgrading of face quality should be accepted. For reporting purposes, the original image
should be presented together with the description of forensic upgrading methods to ensure the auditability and reproducibility of the upgrading process.
7.8 While processing data, law enforcement agencies should always conduct a proper and verified attribution of identity to photos in the reference dataset, and verify the serial number of photos, their traceability and origin.
7.9 The integrity of the reference database should be evaluated regularly, in accordance with the applicable legal framework and best practices.
7.10 Vulnerabilities to hacking and cyberattacks should be identified to ensure robustness and avoid data leaks and data manipulation.

Principle 7: Integrity of images and metadata

8.1 FRT should be used only by trained persons who follow the procedures ordered through the chain of command and/or by management.
8.2 Everybody within the organization, especially the chain of command/management, should understand the capacities and limits of the technology and system used.
8.3 Law enforcement agencies that use or intend to use FRT should provide or facilitate training on an ongoing basis, and this training should be informed by the latest research in machine learning and remote biometrics.
8.4 The training (and certification when it applies) of facial experts, and those in the chain of command/management, should include:
8.4.1 Knowledge of and updates of mandatory regulations, laws or policies concerning the use of biometrics.
8.4.2 Awareness of the risk of biases by the FRT system (anticipation of false positives and false negatives, awareness of differences in performance on various demographics, knowing how to calibrate and adjust the threshold of the system, understanding how to configure the system in the manner appropriate to the specific circumstances and risks of a given use case, and how to fix the length of the candidate lists).
8.4.3 Understanding of the risk of biases by the human
agent (overestimation of own capability, risk of over-reliance on technology, blind spots, risk of human bias such as other-race-effect bias).
8.4.4 Awareness of the risk of false positives from twins, siblings and other related individuals.
8.4.5 Awareness of the risk of image manipulation, including data integrity attacks and data morphs, and, when available, the tools to identify them.
8.4.6 How to implement risk-mitigation methodologies (one match vs. differential diagnosis approach, blinding techniques, blind verifications, etc.).
8.4.7 Understanding of the nature of an investigative lead as the outputs of an FRT search and best practices for verifying the identity of leads generated.
8.4.8 Instruction in data governance procedures, including the collection, storage, integrity and traceability of data.
8.4.9 How to use tools, when available, that assist examiners in understanding the reasoning behind the system's decisions/recommendations.
8.5 Recognizing that innate capability to recognize faces exists on a spectrum, examiners should be recruited by factoring in performance on face comparison tests, acknowledging that experience and training also matter.

Principle 8: Skilled human interface and decision-making

9.1 Information about the use of FRT by law enforcement agencies should be available to the public. This information should be made available on a permanent basis or on request, and communicated by the appropriate official authorities, be it the law enforcement agency itself or another government entity.
9.2 Law enforcement agencies, or the most appropriate other official authority with input from the law enforcement agency, should, in line with the applicable laws and policies, make public:
9.2.1 A clear definition of the use of FRT for law enforcement investigations, specifying the purpose and objectives, such as to identify criminals/fugitives, persons of interest, missing persons and victims.
9.2.2 The
vendor selected (if applicable) and the name and version of the software.
9.2.3 How they use probe images: procedures and criteria to select, store/not store images and, if stored, for how long.
9.2.4 How they use the reference database: procedures to consult the database, criteria to select, store/not store probe images in this reference database and, if stored, for how long; as well as details about whether this database can be used to train or refine other FRT systems or machine learning models in general.
9.2.5 The policy regarding the type of data that may be shared with other organizations, including personal data and databases of face images.
9.2.6 The name of law enforcement departments or units able to launch searches and view the results of searches.
9.2.7 The functional title, type of expertise and level of training of individuals using the system.
9.2.8 The process to determine a possible match, namely blind-review or peer-review of possible matches.
9.2.9 Information about the mechanisms in place (see Principle 1.5) to ensure FRT is used as intended.
9.2.10 Auditable records of search requests made by law enforcement agencies, such as the number of requests, the number of investigative leads generated and the type of crimes related to these requests.
9.2.11 The results of audits and/or evaluations of the performance of the FRT system conducted by the vendor of the technology and/or by the law enforcement agency. This should include a description of: the design of the evaluation; the data used in the evaluation; and the results (metrics) obtained.
9.2.12 Information about how an individual can contact the law enforcement agency to submit a query or complaint concerning its use of FRT.
9.2.13 A record of complaints filed by members of the public against the use of the FRT and the law enforcement agency's response to those formal complaints.
9.2.14 Any other information that can be publicly shared without compromising law enforcement investigations and that may be relevant for the public.
9.3
Information made available to the public should be concise, easily accessible, understandable and provided in clear and plain language. Exceptions to this should be permitted only if they are necessary and proportionate to pursue legitimate purposes and in accordance with the law.

Principle 9: Transparency

3 Self-assessment questionnaire

A self-assessment tool to support law enforcement agencies in ensuring they have introduced the measures needed for responsible facial recognition.

Note: This self-assessment questionnaire has been designed to reflect the preceding principles and is intended to support law enforcement agencies to develop policies surrounding the use of FRT and to review existing policies in line with the proposed principles. It does so by prompting law enforcement agencies to consider how they approach the use of FRT and the rules and procedures they may or may not have in place to responsibly govern the use of FRT in investigations.

The self-assessment questionnaire is intended to serve as a tool to support law enforcement agencies on a continuous basis throughout their use of FRT and, accordingly, should not be considered as a one-off exercise or checklist. It is recommended that agencies regularly run the process of completing the self-assessment questionnaire or reviewing the relevant parts, as follows:
1. Before implementing FRT for the first time
2. Before employing FRT in the context of a new use case
3. After every software update to the core algorithm of the FRT system
4. After changes in the current policies that have an impact on the software, databases or practices concerning the use of FRT

Completing the self-assessment questionnaire will require consultation with multiple stakeholders (both internal and external), including but not limited to the FRT system
provider, biometric experts, IT experts, and legal advisers. It is recommended that the individual(s) completing the questionnaire endeavour to answer all questions, reaching a single conclusion, that the agency is either:
1. Compliant
2. Non-compliant, with a clarification of why not
3. Non-compliant, with a statement of actions that can be taken for improvement
4. Non-compliant, with a statement that action cannot be taken and a clarification of why not

It is recommended that once completed, the final result, along with an explanation and summary of the outcome of the self-assessment questionnaire, is made public to increase transparency and accountability.

1.1 Does your use of FRT for law enforcement investigations respect the International Bill of Human Rights and other relevant human rights treaties and laws?
1.2 Is the output of an FRT search always considered only as an investigative lead?
1.3 What procedures are in place to guarantee that restrictions or limitations to some human rights are allowed only if they are necessary and proportionate to achieving a legitimate policing aim?
1.4 Are you working with oversight bodies to effectively assess the compliance of law enforcement activities with human and fundamental rights?
1.5 Are these bodies tasked with hearing and following complaints from citizens?
1.6 Is there an independent ethical oversight committee to periodically review your use of FRT and support you in achieving respect for human and fundamental rights?
1.7 Is there an existing judicial authority that offers effective remedies to individuals who consider their rights to have been violated through the use of FRT?

Principle 1: Respect for human and fundamental rights

2.1 What uses of FRT are allowed in your jurisdiction and what is the basis in applicable international, regional and national laws or policies?
2.2 What was the objective that guided the decision to use FRT?
2.3 Which
alternatives were considered before taking the decision to deploy FRT in your agency, and what were the criteria that ultimately led to the decision to reject those alternatives?
2.4 How do you ensure that your use of FRT, from the request to the use of the outcome of the search, is appropriate, limited and exclusively related to investigative purposes?
2.5 What uses of FRT are allowed in your jurisdiction (as defined by international, regional and national laws or policies)?
2.6 What are the use cases for which you are authorized to collect images from public spaces for FRT identification?
2.7 What processes and controls are in place to ensure that the collection of images from public and publicly accessible spaces for FRT identification purposes is done only for a determined list of use cases, in a limited area and for a finite time period?
2.8 What procedures are in place governing work conducted with independent authorities in charge of authorizing real-time uses of FRT for identification purposes under exceptional circumstances?
2.9 In cases where your agency deploys real-time FRT, is there an independent authority or an established ethical oversight committee (see Principle 1.5) that regulates its use?
2.10 If real-time use of FRT is authorized through the chain of command because of a lack of time to inform the independent authority, what processes have you introduced to ensure that the chain of command informs the independent authority within 24 hours and justifies its decision to use real-time FRT, outlining why it felt there was insufficient time to obtain authorization in advance of its use?
2.11 In cases of real-time use of FRT, what processes have you implemented to make sure all images recorded by the real-time FRT system, including the biometric template and the original face image, are deleted, according to the defined policies for the storage of live images?
2.12 What processes have you implemented to prevent the use of FRT to infer ethnicity, gender, sex, health
status, age, emotion, opinion, religion or sexual orientation recognition or for predictive analysis?
3.1 What processes have you introduced to ensure that an FRT output is always verified by an examiner with oversight by a manager with the appropriate level of expertise (except in the case described in Principle 2.6)?
3.2 How do you ensure the FRT system is always used by individuals trained as suggested in Principle 8 (except in the case described in Principle 2.6)?
3.3 Is a systematic peer review performed before reaching any final decision?
3.4 When two experts are assigned to evaluate the results, how do you reach a consensus between the examiner and reviewer(s)?
3.5 What mechanisms are in place for citizens to file a complaint with or seek redress from a competent body?
3.6 Do you inform individuals taken into custody, brought in as a witness or involved in an investigation that they were identified using an FRT system, provided this does not compromise the investigation?

Principle 2: Necessary and proportional use
Principle 3: Human oversight and accountability

4.1 What existing or forthcoming standards do you ask your vendor to follow to evaluate the performance of your FRT system?
4.2 Have you introduced procurement rules to select providers who comply with these standards of performance?
4.3 Have you introduced procurement rules to select providers who have submitted their FRT system to an independent evaluation such as that organized by NIST?
4.4 Have you selected the technology provider who presented the best results?
4.5 Are the independent lab tests of performance designed to model, as closely as possible, the real-world objectives and conditions in which the FRT is applied in practice?
4.6 Do you notify the technology provider when you identify relevant errors in the use of the FRT system?
4.7 What procurement rules have you introduced to ensure the regular upgrading or replacement of the FRT?
5.1
How is your technology provider (or where it applies, the integrator) making sure that biases and errors are mitigated to the greatest extent possible before the FRT system's deployment?
5.2 Has the FRT software been tested by an independent third-party organization on the level of performance across different demographic groups?
5.3 Has the design of the quality management system of the FRT system been evaluated by an independent third-party organization?
5.4 Have technology providers and integrators communicated the results of those evaluations to law enforcement agencies and the general public?
5.5 Do your procurement criteria require information to be supplied about the metrics that technology providers use to gauge bias and other relevant risks?
5.6 Did you set up pilot tests before deploying the FRT system?
5.7 Have you deployed risk-mitigation processes to identify, monitor and mitigate the risks of error and biases throughout the entire life cycle of the system?
5.8 Have you programmed internal audits and, if possible, an independent third-party audit, to validate the robustness of your risk-mitigation processes? If yes, have you publicly shared the results of these audits?
5.9 Have you established an in-service support agreement throughout the entire life cycle of the system in collaboration with technology providers and integrators?

Principle 4: Optimization of system performance
Principle 5: Mitigation of error and bias

6.1 Is your processing of probe images and reference databases, including storage criteria, purpose limitation, retention period and deletion rules, compliant with international, regional and national laws or policies?
6.2 What processes have you introduced to ensure that the collection of probe images is conducted on a legal basis and aimed at a specific purpose?
6.3 How do you ensure that images contained in your reference databases are collected only with a legal basis?
6.4 Do you
label unidentified probe images according to their corresponding categories, e.g. as “unidentified suspect” or “unidentified victim”?
6.5 Do you store unidentified probe images in your reference databases? If yes, can they be searched separately?
6.6 Do you remove unidentified probe images from the unsolved probe database if an image's underlying status, which justified the image's insertion in the database, changes?
6.7 What technical measures have you put in place to prevent the export of images and biometric metadata to public cloud-based FRT systems that could potentially be outside the local jurisdiction?
6.8 How do you ensure a strict and transparent chain of custody of all images (probe image sets and reference databases)?
6.9 Are there clear and transparent rules designating who does and does not have access to probe images and reference databases and under what circumstances?
6.10 Have you established clear and transparent protocols for determining when, and based on what criteria, images are to be deleted from a probe image set or a reference database, taking into particular consideration the need to ensure the protection of the privacy of any individuals appearing in such images?
6.11 Is the biometric template of the probe image deleted from the FRT system for all solved cases or for cases for which the investigation has been concluded?
6.12 For all solved cases or for cases for which the investigation has been concluded, is the original facial image stored in line with existing national law and policies for accountability purposes?

Principle 6: Legitimacy of probe images and reference databases

7.1 Have you established image quality standards for reference database images?
7.2 Do you exclude reference images that do not meet those quality standards?
7.3 Do you have a procedure in place to perform an image quality assessment of the probe image before any FRT search is launched?
7.4 Have you established a threshold of a minimum number of pixels between the eyes for the probe image to be used?
7.5
Do you exclude probe images that do not satisfy a manual assessment of image quality?
7.6 What quality reference standards and thresholds are you following? Have you considered best practices and recommendations, such as those presented by ICAO, FISWG, ENFSI-DIWG and IFEWG?
7.7 How do you manage the risks of image manipulation (deepfakes, morphing, etc.)? Do you deploy a specific procedure to detect them when you collect images from uncontrolled sources?

Principle 7: Integrity of images and metadata

8.1 Is FRT used only by trained persons?
8.2 Does everybody within the organization understand the capacities and limits of the technology and system used?
8.3 Is a training programme offered and, if so, how often is it offered?
8.4 How do you evaluate the quality of the training programme over time, taking into consideration the latest progress in research (e.g. have you established a scientific committee or equivalent, etc.)?
8.5 Have you ensured that the training (and certification when it applies) of face experts and agents within the chain of command/management includes information about:
8.5.1 Mandatory regulations, laws or policies concerning the use of biometrics?
8.5.2 Risk of machine biases related to FRT systems?
8.5.3 Risk of human biases when using FRT systems?
8.5.4 Risk of false positives from twins, siblings and other related individuals?
8.5.5 Risk of image manipulation, including data integrity attacks and data morphs, and training on existing or new tools used to detect them?
8.5.6 Implementation of risk-mitigation methodologies?
8.5.7 Nature of the investigative leads and best practices for verifying the identity of leads generated?
8.5.8 Data governance procedures, including the collection, storage, integrity and traceability of data?
8.5.9 Use of tools that assist examiners in understanding the reasoning behind the system's decisions/recommendations?
8.6 Have you implemented recruitment processes to
primarily hire examiners who perform well on standardized face comparison tests?

Principle 8: Skilled human interface and decision-making

7.8 If you detect a manipulated image (deepfake, morphing, etc.), how do you process this image?
7.9 If you perform forensic upgrading of face quality, which methods of image processing do you use? Can any of these processes be considered to modify the original face features, adding or removing data from the image?
7.10 Do you comply with published guidance or standards (such as by FISWG) when using tools for forensic upgrading of face quality?
7.11 How do you ensure that non-forensic upgrading of face quality is used only during the search phase?
7.12 In case of a possible match, do you use the forensically upgraded image for final conclusions?
7.13 How do you document forensic upgrading to ensure the auditability and reproducibility of the upgrading process?
7.14 What processes do you follow to ensure the proper attribution of identity to photos in the reference dataset and to verify the serial number of photos, as well as their traceability and origin?
7.15 Have you performed a system security verification to identify vulnerabilities to hacking and cyberattacks?

9.1 Is information about your use of FRT publicly available on a permanent basis or by request?
9.2 Have you, or another official authority with input from your agency, publicly shared information about:
9.2.1 The purpose of the FRT solution deployed and a clear definition of its use and the various FRT use cases?
9.2.2 The vendor and the name and version of the selected software?
9.2.3 Your processes regarding the use of probe images, including procedures and criteria to select, store/not store images and, if stored, for how long?
9.2.4 Your processes regarding the use of reference databases, including procedures to consult the databases, and criteria to select, store/not store probe images in this reference
database and, if stored, for how long?
9.2.5 Information on whether the reference databases can be used to train or refine other FRT systems or machine learning models in general?
9.2.6 The policy regarding the type of data that may be shared with other organizations, including personal data and databases of face images?
9.2.7 The list of law enforcement departments that have access to FRT search requests?
9.2.8 The functional title, type of expertise and level of training of individuals using the system?
9.2.9 The process to determine a possible match, namely blind-review or peer-review of possible matches?
9.2.10 Information about the mechanisms in place (see Principle 1.5) to ensure FRT is used as intended?
9.2.11 Auditable records of search requests made by law enforcement such as the number of requests, the investigative leads generated and the type of crimes related to the requests?
9.2.12 The results of audits and/or evaluations of the performance of the FRT system conducted by the vendor of the technology?
9.2.13 The results of audits and/or evaluations of the performance of the FRT system conducted by the law enforcement agency?
9.2.14 Information about how an individual can contact law enforcement to submit a query or complaint?
9.2.15 A report presenting citizens' complaints about the use of FRT and the responses from law enforcement agencies to those complaints?
9.3 How do you ensure that the information provided to the public about law enforcement's use of FRT is concise, easily accessible, understandable and provided in clear and plain language?

Principle 9: Transparency

Conclusion

The deployment of FRT for law enforcement investigations around the world is arguably among the most sensitive use cases of facial recognition due to the potentially disastrous effects of system errors or misuses in this domain. The rapid pace and the extent to which FRT has been integrated into law
enforcement has served, for many, to underscore the pressing need to take action to mitigate these risks as much as possible. At the same time, public expectations of law enforcement are exceptionally high and law enforcement is increasingly under pressure to effectively solve crimes and serve justice faster and faster. In the face of ever-more complex and dynamic criminal activities and limited resources, many in the law enforcement community feel that FRT is not only an option, but a necessity.

This insight report is about balance. It suggests that a balance can be struck between the exigencies of law enforcement to innovate and use new technologies to investigate criminal activities and the need to address concerns voiced by critics surrounding this particularly controversial technology. The set of principles contained in this report serves as a proposal for what a robust governance response could look like. It takes into account the diverse perspectives of law enforcement, industry and civil society and has been developed with a global perspective in mind, striving to support not only law enforcement agencies in all countries across the globe, but also policy-makers and technology providers in this field, as well as keeping the general public informed about the current status of FRT in law enforcement.

The work to develop this framework has benefited significantly from the pilot exercise conducted in the first half of 2022. The results of the pilot have served to improve the overall quality of the framework and to ensure that what is presented is actionable, relevant and useable in an operational law enforcement context. The collaboration and participation of the Brazilian Federal Police, the Central Directorate of the Judicial Police of France, the National Gendarmerie of France, the Netherlands Police, the New Zealand Police and the Swedish Police Authority have, in this regard, been invaluable in creating this unique output.

The pilot exercise served to clearly demonstrate that very different procedures exist from agency to agency, which in turn shows a lack of standardization and evidences the absence of guidance to facilitate such standardization. A consensus formed around one aspect in particular, however, and could be seen in the agencies' diverse procedures, namely the importance of the human element of the use of FRT. This human element manifested in three ways. First, it is essential that the human being understands the technology (its functioning, its use and its limitations) in order to be in a position to mitigate the risks. Second, agencies agreed that any output of an FRT search should be reviewed by a trained facial expert. Third, even after this review, the conclusion of the search remains always and solely an investigative lead to be verified by investigators. Collectively, this serves to ensure that a human being is always central to the use of FRT and that identification is never automated. The risk of unfortunate instances of wrongful arrests resulting from the use of FRT can be minimized if this approach is strictly implemented in the manner proposed in this framework.

The pilot has additionally shed light on three other key areas that need additional attention in future:

Transparency and communication with the public about the use of FRT was recognized as a significant challenge for law enforcement agencies. Many agencies highlighted and demonstrated a clear understanding of the importance of this element as a means to build public trust, although they voiced concerns about their own inexperience in this regard and the lack of practical guidance to support them to improve transparency.

Training was repeatedly identified as being instrumental to realizing the ambitions of the entire framework proposed. The pilot exercise demonstrated clearly that training was not always consistently addressed by law enforcement agencies, with
great disparity being seen in terms of the nature, scope and duration of training provided to officers using FRT systems. Going beyond the training of users, it is also vital to ensure that decision-makers in law enforcement equally receive adequate training to enable them to develop and implement internal governance frameworks for the use of FRT in their agencies.

Real-time FRT presents unique challenges, and law enforcement agencies need additional tailored guidance. Although the belief that real-time FRT is the primary application of FRT in law enforcement is unfounded, several pilots of passive real-time FRT have been conducted across the globe and the use of mobile devices for active real-time FRT is growing. While this framework addresses such uses of FRT by law enforcement, further consideration is needed of the additional safeguards and standards that would be required to ensure the outcome of a process involving real-time FRT is reliable and accurate.

Having developed, tested and validated the principles and the complementary self-assessment questionnaire, attention now shifts to leveraging and scaling the work done. Of primary importance in this regard is the need to initiate efforts to encourage decision-makers in law enforcement agencies and national policy-makers to take on board this framework as a guide for their agencies' use of FRT and, ultimately, in the creation or amendment of related rules, procedures and legislation for the use of this technology by law enforcement. The law enforcement community at large, as well as policy-makers at the national and international level, industry partners, civil society organizations and academia engaged in the global debate about the governance of FRT are encouraged to join in these efforts and to promote the adoption and deployment of governance frameworks such as this.

Accuracy of facial recognition
The accuracy of an FRT
system is based on the number of correct predictions,which consist of a combination of two so-called“true”conditions:True positives:when the FRT correctly identifies a person enrolled in the system.True negatives:when the FRT correctly finds no match for a person who is not enrolled in the system.Accuracy is defined as the percentage of correct predictions,i.e.it is calculated by dividing the number of the two types of correct predictions by the number of total predictions.AlgorithmA series of instructions to perform a calculation or solve a problem,implementable by a computer.Algorithms form the basis of everything a computer can do and are,therefore,a fundamental aspect of all FRT systems.AuditVerification activity,such as an inspection or examination of a process or quality system,to ensure compliance with requirements.Bias in facial recognition technologyFalse positives and false negatives rate variations caused by a specific factor;for example,demographic dependencies across groups defined by sex,age,religion,race or country of birth.This lack of accuracy is usually caused by the training dataset of the algorithm,which does not contain enough or accurate representations of the demographics in each case.Biometric identificationApplications that use biometric comparison to verify a biometric“claim of identity”.Biometric recognitionAutomated recognition of individuals based on their biological and behavioural characteristics.It encompasses both biometric verification and biometric identification.Automated recognition implies that a machine-based system is used for the recognition,either for the full process or assisted by a human being.BiometricsA variety of technologies in which unique identifiable attributes of people,including but not limited to a persons fingerprint,iris print,handprint,face template,voice print,gait or signature,are used for identification and verification.Biometric templateA set of stored biometric features.A biometric template is created 
by converting a probe image into a mathematical file of characteristics,distinct from the original facial image,that can be used for subsequent authentication and verification activities.Biometric verificationApplications that search a database of the biometric characteristics of known individuals to find and return the identifier attributable to a single individual.Clustering(NxM)The automated grouping of biometric samples for example,a collection of facial images based on computer-evaluated similarities.In the case of FRT,this can be used to check if the same person appears in various image sources.Computer visionA field of computer science that works on enabling computers to identify and process images in a way similar to how humans perform these actions,and then provide appropriate output.ExplainabilityA property of AI systems that provides a form of explanation for how outputs are reached.Explainability is important to improve decision understanding and increase the trust of operators and users of the FRT systems.Face detectionThe automatic process of finding human faces by answering the question,“Are there one or more human faces in this image?”Face detection differs from face identification/verification as it does not involve biometric analysis.Face identification(or one-to-many)The process of answering the question,“Is this unknown person the same person as in any of the images in a reference database?”Identification compares a probe image to all of the images stored in a reference database,so it is also called“one-to-many”matching.A list of candidate matches is returned based on how closely the probe image matches each of the images from the reference database.Face verification(or one-to-one)The process of answering“yes”or“no”to the question,“Are these two images depicting the same person?”In security or access scenarios,verification relies on the existence of a primary identifier(such as an ID),and facial recognition is used as a second factor to verify 
the persons identity.GlossaryA Policy Framework for Responsible Limits on Facial Recognition Use Case:Law Enforcement Investigations36Facial assessor/reviewer/examinerThree distinct categories of roles in the process of
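
    The glossary's definition of accuracy (correct predictions divided by total predictions) can be made concrete with a short calculation. The following is a minimal Python sketch, not part of the framework itself; the function name and the counts in the example are hypothetical and chosen only for illustration.

```python
def accuracy(true_positives: int, true_negatives: int, total_predictions: int) -> float:
    """Return the share of correct predictions: (TP + TN) / total."""
    if total_predictions <= 0:
        raise ValueError("total_predictions must be positive")
    correct = true_positives + true_negatives
    if correct > total_predictions:
        raise ValueError("correct predictions cannot exceed total predictions")
    return correct / total_predictions


# Hypothetical counts: 1000 searches, of which 90 are correct matches
# (true positives) and 880 are correct "no match" results (true negatives).
print(accuracy(90, 880, 1000))  # 0.97
```

    Note that a single accuracy figure hides how the remaining errors split between false positives and false negatives, which is why the glossary treats bias in terms of those two rates separately.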

    Views: 37 | Published: 2022-11-09 | 43 pages | Rating: 5 stars
  • FSIC: The Future of Facial Recognition and Its Impact on Minorities (English edition) (19 pages).pdf

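    In 1994, Bally's began using facial recognition technology in its Las Vegas casinos. A human watching camera feeds could track one person, but facial recognition allowed computers to track hundreds of people simultaneously in real time. It began as a solution for combating cheating and fraud and, as casinos quickly learned their customers' preferences in games and drinks, evolved into a marketing tool. Fast forward to 2007, when Australia launched "SmartGate", which uses biometric passports to verify a person by comparing a facial scan with the passport photo. Today, facial recognition technology (FRT) has made huge leaps and no longer needs to compare against tagged images or a database of passport photos. Artificial intelligence (AI) can now determine who a person is from a single unlinked image, video or audio file by searching the internet. Or can it?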

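    Views: 37 | Published: 2022-07-12 | 19 pages | Rating: 5 stars
  • Splio: Best Practices and Case Studies for Identifying and Engaging Potential Customers Among WeChat Followers (English edition) (55 pages).pdf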

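    Omnichannel CRM: all your customer data and transactions in one place, covering Tmall, JD.com, WeChat stores and offline stores. Know your customers: who buys from you, through which channels and how often. Marketing automation: target and send personalized, segmented campaigns. Loyalty engine: increase lifetime value and sales by recognizing and rewarding loyalty, not only in transactions but in every interaction with your brand.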

    Views: 35 | Published: 2022-06-21 | 55 pages | Rating: 5 stars
  • Future Today Institute (FTI): 2022 Recognition, Testing and Privacy Trends Report (English edition) (61 pages).pdf

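    We also highlight emerging or atypical threats across most industries, including all levels of government. Those working in creative fields will find a wealth of new ideas to spark the imagination. Our framework organizes nearly 600 trends into 13 clear categories, each published as a separate report.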

    Views: 42 | Published: 2022-05-23 | 61 pages | Rating: 5 stars
  • MobiDev: Optical Character Recognition (OCR) Technology for Business Owners (English edition) (18 pages).pdf

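    This is why OCR is commonly used for business process optimization and automation. OCR output is further used for electronic document editing and compact data storage, and it also forms the basis of cognitive computing, machine translation and text-to-speech technologies.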

    Views: 56 | Published: 2022-05-16 | 18 pages | Rating: 5 stars
  • IAS: 2021 Biometric Research Report on Ad Context (English edition) (29 pages).pdf

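    Matched ads are ads that align with, or "match", the surrounding content. On average, the 4 ads matched to the surrounding article outperformed unmatched ads in both detail memory and global memory. This means consumers are more likely to remember the call to action and the general theme of a matched ad.

    Endemic matched ads address the problem the article presents, such as offering movie tickets in an article about movies. Consumers are most likely to remember ad details when those details match the information in the surrounding article. In addition, endemic matched ads created the strongest emotional connection, with emotional intensity up 43%.

    Topically matched ads share a theme with the surrounding content, such as a summer drink ad in an article about summer. Ads that matched the article's topic showed the greatest lift in global memory; for example, consumers are most likely to remember summer-themed ads placed with similarly seasonal articles.

    Informational ads are characterized by details the reader must process, such as product offers or specific calls to action. These ads should elicit a detail memory response, because they require the reader to recall specific elements in order to activate. Informational ads showed the greatest lift in detail memory when their information matched that of the surrounding article.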

    Views: 31 | Published: 2021-09-22 | 29 pages | Rating: 5 stars
  • PatSnap (智慧芽): 2021 Face Recognition Industry White Paper (69 pages).pdf

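    Composition of the midstream technology layer: the midstream consists of technologies such as video face recognition, image face recognition and database comparison and verification. It broadly comprises five steps (face detection, face preprocessing, feature extraction, comparison and recognition, and liveness detection) and is the key to expanding downstream applications.

    Role of each step: face detection, face preprocessing and feature extraction can together be called face image parsing. Faces are detected in video and images, suitable face images are selected through an image quality check, and face feature vectors are extracted for the subsequent comparison. Comparison and recognition falls into two types, face verification (1:1) and face identification (1:N). Liveness detection algorithms determine whether the face image being processed was captured from a real, live person. The mainstream face recognition algorithms today are those based on facial landmarks, those based on the whole face image, those based on templates, and those using neural networks. With the spread of deep learning, the gap between the major companies' face algorithms is also narrowing.

    Algorithm precision and accuracy: in the Face Recognition Vendor Test (FRVT), the global face recognition algorithm benchmark run by the US National Institute of Standards and Technology (NIST), recognition accuracy can exceed 99% even at a false alarm rate of one in ten million. Chinese companies hold a leading position in face recognition algorithms, with Yitu, SenseTime, Megvii and Dahua among the leaders in the test results. However, although face algorithms score very well on benchmark datasets, they are still far from satisfactory in commercial applications.

    2D vs 3D solutions: solutions in the face recognition market mainly comprise 2D and 3D recognition. 2D recognition is currently mainstream, but because the human face is not flat, 2D recognition loses feature information when it projects 3D face information onto a plane. 3D recognition uses three-dimensional face modelling, which preserves as much useful information as possible; it is better founded and more accurate than 2D algorithms and is one of the future directions of the technology.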
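
    The distinction between face verification (1:1) and face identification (1:N) described above can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from the white paper: it assumes that feature vectors have already been extracted, uses cosine similarity as the comparison score, and the vectors, names and threshold of 0.8 are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def verify(probe, reference, threshold=0.8):
    """1:1 verification: do these two vectors depict the same person?"""
    return cosine_similarity(probe, reference) >= threshold


def identify(probe, gallery, threshold=0.8):
    """1:N identification: return candidates above the threshold, best first."""
    scored = [(name, cosine_similarity(probe, vec)) for name, vec in gallery.items()]
    candidates = [(name, score) for name, score in scored if score >= threshold]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-dimensional feature vectors; real systems use far more dimensions.
gallery = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
    "person_c": [0.9, 0.1, 0.0],
}
probe = [1.0, 0.05, 0.0]

print(verify(probe, gallery["person_a"]))               # True
print([name for name, _ in identify(probe, gallery)])   # ['person_a', 'person_c']
```

    Identification returns a ranked candidate list rather than a single answer, so the choice of threshold trades false alarms against missed matches.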

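    Views: 693 | Published: 2021-05-24 | 69 pages | Rating: 5 stars
  • Frost & Sullivan (弗若斯特沙利文): Artificial Intelligence Industry: China AI Speech Recognition Market Research Report (19 pages).pdf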

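    All content in this report (including but not limited to data, text, charts and images) is a highly confidential document proprietary to Frost & Sullivan, except where a source is otherwise indicated in the report. Without Frost & Sullivan's prior written permission, no one may copy, reproduce, disseminate, publish, quote, adapt or compile the content of this report in any form; Frost & Sullivan reserves the right to take legal action and hold those responsible liable for any violation. All commercial activities of Frost & Sullivan are conducted under the trade name and trademark "弗若斯特沙利文" or "Frost & Sullivan". Frost & Sullivan has no branches operating under any other name, and it has not authorized or engaged any third party to conduct business on its behalf.

    China AI Speech Recognition Market Research Report
    A Frost & Sullivan Research Report

    2020 Frost & Sullivan. All rights reserved. This document contains highly confidential information and is the sole property of Frost & Sullivan. No part of it may be circulated, quoted, copied or otherwise reproduced without the written approval of Frost & Sullivan.

    Contents
    1 AI speech recognition: definition and interpretation
    2 Overview of China's AI speech recognition market
    2.1 AI speech recognition industry chain analysis
    2.2 Drivers of China's AI speech recognition market
    2.2.1 Demand side: downstream demand grows and the market for AI speech recognition expands steadily
    2.2.2 Technology side: upgrades in computing power, algorithms and big data keep raising recognition accuracy
    2.2.3 Policy side: AI is elevated to a national strategy and the AI speech recognition industry accelerates its deployment
    2.3 Trend insights for China's AI speech recognition market
    2.3.1 Cloud computing matures and commercial prospects broaden
    2.3.2 Multiple technologies develop together and voice interaction becomes more vivid
    2.3.3 Speech technology opens up and an inclusive ecosystem flourishes
    2.4 Key success factors in China's AI speech recognition market
    3 Competitive landscape of China's AI speech recognition market
    3.1 Data sources and research subjects of the Frost & Sullivan company growth evaluation
    3.2 Results and analysis of the company growth evaluation
    3.3 Design of the Frost & Sullivan evaluation model
    4 Research methods and scope
    4.1 Research methods
    4.2 Research scope

    1 AI speech recognition: definition and interpretation

    Speech recognition is the gateway to human-computer interaction. It refers to the ability of a machine or program to receive and interpret sound, or to understand and carry out spoken commands. In the intelligent era, more and more scenarios adopt conversation as the main form of interaction when designing personalized interfaces. A complete conversational interaction is a closed loop of three steps: hearing, understanding and responding. "Hearing" requires automatic speech recognition (ASR); "understanding" requires natural language processing (NLP); "responding" requires speech synthesis (text to speech, TTS). The three steps are tightly linked and complement one another. Speech recognition is the starting point of conversational interaction and the foundation for making it efficient and accurate.

    Speech recognition technology emerged in the 1950s. Since then, mainstream algorithm models have passed through four stages: template matching, pattern and feature analysis, probabilistic statistical modelling, and today's mainstream deep neural networks. Leading speech recognition vendors now mainly use end-to-end algorithms, and recognition accuracy can exceed 98% under ideal laboratory conditions.

    Figure 1-1: Development history of AI speech recognition. Source: compiled with fsTEAM software; chart by Frost & Sullivan Research Institute.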
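    2 Overview of China's AI speech recognition market

    2.1 AI speech recognition industry chain analysis

    China's AI speech recognition market has many participants, divided mainly into upstream, midstream and downstream.

    Figure 2-1: AI speech recognition industry chain analysis. Source: Frost & Sullivan Research Institute.

    Upstream: underlying technology provides strong support, and cloud computing drives the spread of AI speech applications. Speech recognition decoding involves two parts, recognition modelling and model training, for both the acoustic model and the language model. The volume of training data and the amount of computation required are very large, and a traditional CPU or single processor can hardly complete a full model training run quickly on its own. The main reason is that a CPU contains only a small number of logic units and executes instructions one by one in serial, so speech recognition on this architecture takes too long to meet the real-time requirements of massive data computation. Cloud computing, which offers massive data processing, storage and high-performance computing, has therefore become an application hotspot in the speech recognition industry. Mainstream speech recognition companies now run model training and speech recognition largely in the cloud, on GPU parallel architectures or heterogeneous computing solutions.

    Midstream: speech technology keeps upgrading, and ecosystems are built to empower industry. The midstream consists mainly of hardware and software service providers that commercialize speech recognition technology. By type of end customer, midstream vendors fall into a consumer market and a professional market. The main consumer products include consumer smart hardware, smart speakers and voice input methods. Professional products mainly take two forms: industry solutions (hardware and software products and services delivered on a project basis) and platform-based technology output (open intelligent speech platforms offered as SDKs or APIs). The more vertical, solution-based form currently accounts for the larger share of professional commercial revenue. Open speech platforms account for a small share of revenue in the intelligent speech market mainly because major vendors, led by Alibaba, Baidu and iFlytek, offer speech recognition services to developers at various discounts or even free of charge in order to speed up the penetration of AI speech technology into downstream scenarios. They hope to bring speech recognition into more software and scenarios and to build a complete AI industry ecosystem together with developers.

    Downstream: industry applications are diverse, and demand for one-stop services is broad. As an important gateway to AI interaction, speech recognition is one of the most important and most mature technologies in artificial intelligence and is already widely used in downstream markets in a variety of commercial forms. By application area, the consumer market mainly covers smart hardware, smart home, smart education and in-vehicle systems, while the professional market mainly covers healthcare, public security and the judiciary, education, customer service and voice content moderation. Broad application areas mean more diverse usage scenarios, yet current speech recognition technology is strongly constrained by the usage scenario. Although rapidly iterating neural network architectures have reduced the error rate of near-field speech recognition in quiet environments to below 3%, most real-world scenarios cannot offer ideal conditions, so noise, channel effects and other factors must be taken into account. To keep speech recognition performing well across a wider range of scenarios, AI speech vendors need to provide one-stop services that combine hardware and software and to optimize for users' actual pain points, thereby raising the penetration of speech recognition across diverse downstream scenarios.

    2.2 Drivers of China's AI speech recognition market

    2.2.1 Demand side: downstream demand grows and the market for AI speech recognition expands steadily

    Over the past five years, demand for AI speech in China first took off in the consumer market, thanks mainly to increased spending on speech recognition by internet and smart hardware vendors and to the smart speaker hardware subsidies that vendors introduced to capture the market early. Consumer products and services currently include smart speakers, in-vehicle systems, smart hardware and consumer internet value-added services. However, the applications and usage scenarios of speech recognition, including products and services sold directly to consumers, remain limited. As consumer product suppliers and developers jointly build an industry ecosystem, speech recognition will integrate better with other voice interaction technologies and software functions and offer consumers a better experience, opening up broad room for growth in the AI speech recognition market.

    In the professional market, the main product forms are open intelligent speech platforms and industry solutions. Downstream applications are currently concentrated in areas with relatively high levels of digitalization, such as smart healthcare, smart education, enterprise customer service, judicial and government affairs, and finance. As one of the important gateways to human-computer interaction, AI speech recognition must not only perform well at recognition itself but also integrate better with other intelligent speech technologies (including semantic understanding, far-field speech recognition, wake-word detection, full-duplex interaction and personalized recognition) in order to improve the user experience in real scenarios. The rapid recent growth of the professional market is due not only to the large accuracy gains brought by deep neural network algorithms but, more importantly, to the broader application scenarios opened up by the development of other intelligent speech and AI technologies. Commercial demand in the professional market is expected to be released further.

    Figure 2-2: Commercial revenue of China's AI speech recognition market, 2015-2024 (forecast). Scope: (1) professional market: intelligent speech industry solutions, open intelligent speech platforms and so on; (2) consumer market: hardware directly related to speech recognition, such as smart speakers, and the corresponding consumer software and services, such as personalized teaching and learning platforms, voice input methods and smart examinations. The commercial revenue above includes only revenue directly related to intelligent speech; hardware revenue and revenue from other technologies are excluded from this market size. Source: Frost & Sullivan Research Institute.

    2.2.2 Technology side: upgrades in computing power, algorithms and big data keep raising recognition accuracy

    The rapid commercialization of AI speech recognition over the past 5-10 years is due mainly to rapid technical progress, such as greater computing power, better algorithm frameworks and upgraded big data.

    Figure 2-3: Technology development in China's AI speech recognition market. Source: Frost & Sullivan Research Institute.

    Computing power: large gains in chip processing capability, the widespread use of GPUs, the spread of cloud services and rapidly falling hardware prices have together underpinned the growth of AI computing power.

    Algorithm frameworks: mainstream speech recognition models are now dominated by deep neural networks, whose emergence and spread have played an important role in raising recognition accuracy.

    Big data: corpora that more closely reflect real usage scenarios provide more effective training material for speech recognition and have greatly improved the experience of using AI speech recognition products and services.

    These upgrades in underlying technology have been a powerful market driver for both the accuracy and the commercial penetration of speech recognition.

    2.2.3 Policy side: AI is elevated to a national strategy and the AI speech recognition industry accelerates its deployment

    The level of AI development reflects, to some extent, the highest level of a country's science and technology. Given the importance of AI to national economic development, the Chinese government has issued a number of national-level development policies for the AI industry, and since 2017 the industry has been written into the national Government Work Report for three consecutive years. Specific support includes project development funds, talent recruitment policies and other state support. Speech recognition is currently among the most mature and widely deployed AI technologies in China and, with strong policy support, is expected to accelerate its penetration and deployment in vertical industries.

    At the same time, against the backdrop of "Made in China 2025" and the new intelligent economy, provinces and cities have responded to the central government's call. By the first half of 2019, more than 30 provinces and cities had issued AI-related plans or dedicated policies, using AI as a technical means to exploit the strengths of local industry clusters and to promote integration and coordinated development among industry, academia and research.

    Figure 2-4: National and local policies and their impact. Source: Frost & Sullivan Research Institute.

    2.3 Trend insights for China's AI speech recognition market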
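    2.3.1 Cloud computing matures and commercial prospects broaden

    Mainstream algorithm models for AI speech recognition have moved from the template matching stage to the deep neural network stage. Deep neural network training uses large amounts of data and requires enormous computation, so the computing power threshold of local computation is too high for companies that want to apply the technology. In today's intelligent era, increasingly widespread cloud computing improves the efficiency of AI speech recognition while lowering the barrier to entry for companies, and has therefore greatly advanced AI speech technology. A speech recognition terminal performs analogue-to-digital conversion on the captured speech segment, handles transmission and decision-making, and then uploads the speech data over the communication network to the cloud for recognition; the result is returned to the terminal. In this process, cloud computing can train the speech and language databases and return results with maximum efficiency, which helps raise the accuracy of AI speech technology.

    Building on cloud computing, some leading vendors are gradually launching cloud-based speech products, and commercialization is accelerating. In certain application scenarios, sales of cloud-based AI speech applications have come close to overtaking the market served by leading vendors based on traditional hardware. Large numbers of independent software vendors (ISVs) are partnering with cloud speech technology vendors to obtain leading-edge cloud speech technology and industry speech solutions on open cloud platforms at low cost. For example, more than 50,000 speech customers now work with Alibaba Cloud Intelligent Speech across many industry scenarios, including large enterprises in traditional industries such as China Mobile, China Central Television and China Merchants Bank. In telephone customer service, the top eight ISV users partnering with Alibaba have combined annual sales approaching RMB 600 million, and alliance partners in the court speech recognition market reached annual sales of RMB 160 million in 2019. Cloud-based AI speech technology can meet ISVs' technical needs in recorded-file recognition, real-time speech recognition, short-utterance recognition, self-learning platforms, short-text and long-text speech synthesis, voice wake-up, voiceprint recognition, voice modules and voice interaction SDKs, supporting them in implementing and expanding more application scenarios and channels.

    Figure 2-5: Relationship between memory size and computing power required for neural network simulation. Source: Frost & Sullivan Research Institute.

    2.3.2 Multiple technologies develop together and voice interaction becomes more vivid

    Speech recognition belongs to perceptual intelligence within AI. Its core function is to convert information from the physical world into information a computer can process, providing the basis for subsequent cognitive intelligence. As an important perceptual gateway to AI, speech recognition can therefore go beyond its own single function and be deeply integrated with other AI technologies for use in a wider range of everyday scenarios. With front-end voice interaction as the entry point and back-end internet services behind it, the coordinated development of multiple technologies not only strengthens each individual technology but also drives innovation in AI speech-related industries and favours the rise of new industries. Emerging industries such as service robots and intelligent customer service are growing rapidly, driven by AI speech recognition.

    Technology convergence has become the trend: only by fully combining multiple technologies can more value be delivered to users. In public security and the judiciary, for example, combining acoustic signal processing, pattern recognition, natural language processing and speech synthesis enables smart court hearings, anti-fraud measures for telecom networks, virtual judges, voiceprint analysis, intelligent emergency call handling and intelligent voice services for policing, providing comprehensive and efficient services to participants in the public security and judicial system.

    Figure 2-6: Technical applications and deployment of speech recognition in public security and the judiciary. Source: Frost & Sullivan Research Institute.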
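    2.3.3 Speech technology opens up and an inclusive ecosystem flourishes

    The "intelligence" of smart hardware lies in strong perception, machine learning, natural language understanding and similar capabilities, all of which depend heavily on big data and cloud computing. These data and computing resources are currently held largely by big AI speech recognition vendors, which creates an invisible barrier to the development of the AI ecosystem. In addition, as cloud computing and speech recognition advance, AI speech will penetrate every industry, but each industry has its own characteristics and a single general-purpose AI speech technology is unlikely to suit all of them. To accommodate diverse industry applications and improve development and deployment efficiency, the ability to build customized models should gradually be opened up, so that developers are not constrained by industry attributes during early model training. Teaching developers "how to fish" lets them customize industry-specific algorithm models and makes the ecosystem genuinely inclusive.

    AI speech vendors are gradually beginning to build such an inclusive ecosystem. On a reliable technical foundation, they empower industry upgrading by combining software and hardware, providing one-stop services from the chip to speech algorithms and platforms, and opening up their intelligent speech algorithms through platforms so that partners can efficiently develop targeted products. Building an inclusive AI ecosystem requires partnering with independent software vendors in the AI industry, lowering the technical barrier for them to deploy more application scenarios, and jointly providing tailored solutions that are closer to actual needs. Baidu, for example, announced that its speech technology interfaces would be permanently free and open, offering multi-platform SDKs for speech recognition, speech synthesis and voice wake-up. Alibaba Cloud's voice interaction technology service platform focuses on core speech capabilities and, on a cloud-based open platform, provides midstream and downstream ISVs with a one-stop service that includes atomic speech capabilities, out-of-the-box industry models and a self-learning platform, working closely with industry partners and customers to build products that are closer to their needs.

    The future will be an era of inclusive AI, and technical openness will be a very important characteristic of it. In today's fiercely competitive environment in particular, the technical gap between AI speech recognition vendors is no longer pronounced. A vendor that opens up its speech technology and empowers industry partners can not only win users and market share quickly but also help the whole ecosystem flourish, to the benefit of the AI application market.

    2.4 Key success factors in China's AI speech recognition market

    (1) Strong technical support

    It is becoming an industry consensus that AI speech starts with a technical breakthrough, moves on to successive breakthroughs in applications, terminals and scenarios, and then returns to technology. The addition of semantic recognition, the construction of knowledge graphs and rapid technical iteration have also made it possible for intelligent speech technology to cross the 3% red line and even reach higher recognition rates. However, AI speech has not been developing for long in the overall history of technology. Moving from hearing clearly to understanding, and finally to meeting whatever the user wants, still requires continuous training, experimentation and technical iteration. For speech recognition vendors, one of the industry's important success factors is how to integrate acoustics, linguistics and other disciplines in real-world scenarios, iterate the technology and improve the algorithms, and thereby provide highly accurate speech recognition services.

    (2) Sufficient corpus accumulation

    If algorithms are the engine of AI speech technology, data is the fuel. Algorithms need huge amounts of data from real scenarios, and that data must be labelled with reasonable precision. When building a training set for voiceprint recognition, for example, the gender ratio should be kept at 50% ± 5% at minimum, and the set should include training samples from different age ranges and regional accents. For speech recognition vendors, accumulating sufficient and timely speech and text material in the real scenarios of vertical industries, labelling it, and updating and iterating it in real time is key to optimizing the user experience and increasing customer stickiness, and is one of the key success factors in the industry.

    (3) Rich scenario soil

    Applications, terminals and scenarios have generated large amounts of application data and have brought technical breakthroughs to the machine learning and deep learning used in speech recognition. The boom in speech recognition technology stems from big data: the more data there is, the more accurate the algorithms, and the higher the recognition accuracy.

    In addition, speech recognition technology needs rich scenario soil in which to develop the ability to handle complex scenarios quickly. On the consumer side, consumers have held high expectations of speech recognition since its earliest days, hoping that the new technology will improve their quality of life and that highly accurate speech recognition will be woven creatively into everyday scenarios. On the professional side, government and enterprise users in the judiciary, healthcare, education, telecommunications, transport and other sectors need speech recognition systems to be functionally reliable and stable in real business use, so when choosing speech recognition products they run rigorous tenders to select the AI speech recognition suppliers with the greatest capability and industry experience. Faced with rising user expectations, AI speech vendors need to accumulate rich scenario experience and develop the ability to handle complex scenarios quickly in order to win the market. Moreover, once users have selected a speech recognition supplier they tend to maintain a stable long-term relationship, which is another factor in AI speech vendors' continued success.

    3 Competitive landscape of China's AI speech recognition market

    Through in-depth interviews and research with market leaders, participants, users and industry experts, and by reviewing public information, Frost & Sullivan developed an evaluation system comprising a set of evaluation indicators and market weights. On this basis it assessed China's mainstream AI speech vendors objectively and impartially and analysed their growth strength in China's AI speech industry.

    3.1 Data sources and research subjects of the Frost & Sullivan company growth evaluation

    The data in this report come from technical indices, market data and interviews with developers, service providers and industry experts in AI speech recognition. The subjects evaluated are China's mainstream AI speech recognition vendors, which by nature fall into two categories: IT and internet vendors, and speech technology vendors.

    IT and internet vendors: these include Baidu, Alibaba, Tencent, Sogou and Xiaomi. With the large capital they accumulated in the internet era, IT and internet vendors have become major players in many technology fields. Unlike vertical AI companies that built their business on AI technology, they develop AI chiefly to drive traffic and to meet the experience and innovation needs of their huge user bases, and they place more emphasis on innovative practice than traditional speech technology vendors do.

    Speech technology vendors: these can be divided into traditional speech technology vendors and start-ups. Traditional speech technology vendors, including iFlytek, Xiao-i Robot and SinoVoice (捷通华声), have their own core intelligent speech chips and speech-related software systems. Start-ups, including Unisound (云知声), AISpeech (思必驰) and Mobvoi (出门问问), focus on particular vertical areas (such as automotive and home appliances) to promote their speech technology and products.

    3.2 Results and analysis of the company growth evaluation

    The evaluation and model calculation above yield the following ranking of overall scores.

    Figure 3-2: Competitiveness analysis of China's mainstream AI speech recognition vendors. Note: the shade of each dot represents the customer index value; the darker the dot, the higher the customer index. Source: Frost & Sullivan Research Institute.

    According to this model, Alibaba, iFlytek, Baidu and Tencent all fall in the high-competitiveness range, and Alibaba performs strongly on all three dimensions: growth index, innovation index and customer index.

    Alibaba

    Alibaba's intelligent voice interaction service is an industry-leading cloud-native speech service platform. Internally it serves more than 99% of speech scenarios within Alibaba Group, and externally it offers a range of cloud speech products.

    (1) Across the full speech recognition industry chain, Alibaba's cloud speech technology has greatly improved the accuracy and performance of voice interaction. This rests on its large accumulation of data, its accumulated algorithms (the original next-generation end-to-end speech recognition technology SCAMA and the SAN-M algorithm, and high-concurrency real-time and offline speech recognition on CPU servers), its accumulated engineering (cloud-native AI technology, large-scale elastic computing, and the capacity to serve hundreds of millions of requests per day for the group) and the simultaneous release to the cloud of technology from Alibaba DAMO Academy. Alibaba's intelligent voice interaction technology is deployed in mature form in many scenarios, including intelligent customer service, intelligent quality inspection, real-time court hearing records, real-time speech subtitles, transcription of interview recordings, voiceprint login and on-device voice interaction. It has application cases and a large customer base in government affairs, finance, logistics, education, e-commerce, the broader internet sector, healthcare, catering and other fields.

    (2) In terms of market, Alibaba Intelligent Speech currently holds a leading position in both the telephone customer service and court speech recognition markets: it has the largest technical cooperation alliance in telephone customer service nationwide, and it has partnered with all of the leading application vendors in the court sector, covering nearly 10,000 offline courtrooms and more than 15,000 online courtrooms. One key reason Alibaba's speech AI technology has been deployed quickly in many fields, captured market share and become one of the cloud speech vendors best recognized by industry customers is that it has formed the Alibaba speech AI industry alliance with a large number of independent software vendors.

    (3) In terms of business operations and strategy, Alibaba Cloud plans to keep increasing its investment in research and development of core infrastructure technologies such as the cloud operating system, servers, chips and networking, bringing more imaginative application scenarios and value to the combination of speech AI and the cloud.

    (4) In terms of innovation, Alibaba attaches importance to R&D investment and to contributing to the industry. Its core intelligent speech capabilities are the next-generation end-to-end speech recognition technologies SCAMA and SAN-M and the DFSMN technology. These were the first applications of such technologies outside research settings and have been highly successful, substantially improving the accuracy of voice interaction under high concurrency. DFSMN has also been open-sourced to the whole industry, contributing to an overall improvement in technology. Alibaba was also the first in the industry to launch a self-learning platform, which changed the production relationships of speech AI: even practitioners without much specialist speech knowledge can use Alibaba Cloud's self-learning product to obtain good voice interaction results in their own industry simply by feeding in industry data and knowledge. Alibaba's smart large screen has unlocked multiple scenarios. Alibaba DAMO Academy's multimodal voice interaction solution, a world first, combines speech, vision and natural language understanding to enable wake-word-free human-computer interaction in very noisy environments, giving it a notable advantage in product portfolio.

    (5) In terms of customer service, the main commercial strategy of Alibaba Intelligent Speech is to provide its partners with atomic speech capabilities, out-of-the-box models for multiple fields and a "self-learning" platform, giving customers' products an intelligent human-computer interaction experience that "can speak, speaks well and understands you". While empowering alliance companies with Alibaba Cloud's leading intelligent speech technology, it is building a more complete and thriving AI speech application ecosystem and inclusively benefiting the market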

    Views: 541 | Published: 2020-12-21 | 19 pages | Rating: 5 stars
  • China Electronics Standardization Institute (中国电子技术标准化研究院): 2020 Behaviour Recognition Industry Research Report (38 pages).pdf

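    SAC/TC28/SC37

    4.2 Standards related to behaviour recognition

    4.2.1 International standards

    Table 1: International standards related to behaviour recognition
    No. | Committee | Standard no./plan no. | Title | Status
    1 | ISO/IEC JTC 1/SC37 | ISO/IEC 19784-1:2018 | Information technology - Biometric application programming interface - Part 1: BioAPI specification | Current
    2 | ISO/IEC JTC 1/SC37 | ISO/IEC CD 39794-16 | Information technology - Extensible biometric data interchange formats - Part 16: Full body image data | Drafting
    3 | ISO/IEC JTC 1/SC37 | ISO/IEC DIS 39794-17.2 | Information technology - Extensible biometric data interchange formats - Part 17: Gait image sequence data | Drafting
    4 | ISO/IEC JTC 1/SC37 | ISO/IEC 19795-2:2007/Amd 1:2015 | Testing of multimodal biometric implementations | Current
    5 | ISO/IEC JTC 1/SC37 | ISO/IEC 30137-1:2019 | Information technology - Use of biometrics in video surveillance systems - Part 1: System design and specification | Current
    6 | ISO/IEC JTC 1/SC37 | ISO/IEC TR 29195:2015 | Traveller processes for biometric recognition in automated border control | Current

    4.2.2 Domestic standards

    Table 2: Domestic standards related to behaviour recognition
    No. | Committee | Standard no./plan no. | Title | Status
    1 | SAC/TC100 | 20020028-Q-312 | Digital video recording equipment for video security surveillance | Under approval
    2 | SAC/TC100 | 20090054-T-312 | Technical requirements for real-time intelligent video analysis equipment for security surveillance | Under approval
    3 | SAC/TC100 | GB 20815-2006 | Digital video recording equipment for video security surveillance | Current
    4 | SAC/TC100 | GA/T 1354-2018 | Technical requirements for vehicle-mounted digital video recording equipment for security video surveillance | Current
    5 | SAC/TC100 | GA/T 647-2006 | Control protocol for front-end equipment of video security surveillance systems V1.0 | Current
    6 | SAC/TC100 | GB/T 30147-2013 | Technical requirements for real-time intelligent video analysis equipment for security surveillance | Current
    7 | SAC/TC28/SC37 | (none listed) | Information technology - Extensible biometric data interchange formats - Part 16: Full body data | Drafting
    8 | SAC/TC28/SC37 | (none listed) | Information technology - Extensible biometric data interchange formats - Part 17: Gait data | Drafting

    4.3 Summary

    Behaviour recognition is one of the important technologies in biometric recognition, but it started later than face and fingerprint recognition. Standardization research on behaviour recognition is still incomplete in both international and domestic standards bodies.

    AI-based behaviour recognition is developing rapidly. It will provide new biometric support and safeguards for security, education, sports, healthcare and other fields, and it is attracting wide attention from industry and the market. However, no unified technical standard has yet been formed. Technology developers and application vendors each follow their own approach, upstream and downstream standards are hard to align and interoperability is poor, which keeps manufacturing, development and adaptation costs high. In addition, the collection, storage, processing and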

    Views: 103 | Published: 2020-12-01 | 29 pages | Rating: 5 stars
  • China Electronics Standardization Institute (中国电子技术标准化研究院): 2020 Genome Recognition Industry Research Report (36 pages).pdf

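    SAC/TC28/SC37

    Table 1: International standards related to genome recognition
    No. | Standard no. | Title
    1 | ISO/IEC 19794-14 | Information technology - Biometric data interchange formats - Part 14: DNA data
    2 | ISO/IEC 29109-1:2009/Cor 1:2010 | Information technology - Conformance testing methodology for biometric data interchange formats defined in ISO/IEC 19794 - Part 1: Generalized conformance testing methodology - Technical Corrigendum 1
    3 | ISO/IEC 2382:2015 | Information technology - Vocabulary
    4 | ISO/IEC 19784-2:2007 | Information technology - Biometric application programming interface - Part 2: Biometric archive function provider interface
    5 | ISO/IEC 19795-7:2011 | Information technology - Biometric performance testing and reporting - Part 7: Testing of on-card biometric comparison algorithms

    4.2 Domestic standardization

    4.2.1 Standards organizations

    In China, standards for biometric recognition and genome sequencing are developed mainly by the Biometrics Subcommittee of the National Information Technology Standardization Technical Committee (SAC/TC28/SC37), the Authentication and Authorization Working Group of the National Information Security Standardization Technical Committee (SAC/TC260/WG4), the National Technical Committee for Biochemical Testing Standardization (TC387), the National Research Centre for Certified Reference Materials, the National Technical Committee for Biological Sample Standardization (TC559) and the China National Institute of Standardization (424-cnis). SAC/TC28/SC37 has set up a genome recognition working group, has published a standard on genome recognition data interchange formats, and is developing standards on DNA sample quality, genome typing systems and related topics. SAC/TC260/WG4 is developing standards such as security requirements for genome recognition data.

    4.2.2 Standards

    SAC/TC28/SC37 has developed three standards related to genome recognition technology, of which one has been published and two are in the project approval stage. SAC/TC260 has developed one genome recognition standard; TC387 has developed eight genome sequencing standards; the National Research Centre for Certified Reference Materials has developed three; TC559 has developed seven; and 424-cnis has developed four. Details are given in Table 2.

    Table 2: National standards and standards plans related to genome recognition
    No. | Organization | Standard no./plan no. | Title | Status
    1 | SAC/TC28/SC37 | GB/T 26237.14-2019 | Information technology - Biometric data interchange formats - Part 14: DNA data | Published
    2 | SAC/TC28/SC37 | 2020101150 | Information technology - Biometric sample quality - Part 14: DNA data | Project approval in progress
    3 | SAC/TC28/SC37 | 2020101053 | Information technology - Biometrics - Specification for genome typing systems | Project approval in progress
    4 | SAC/TC260 | (none listed) | Information security technology - Security requirements for gene recognition data | Draft
    5 | TC387 | GB/T 34797-2017 | Technical requirements for the quality of nucleic acid primers and probes | Published
    6 | TC387 | GB/T 34798-2017 | Specification for sequence formats in nucleic acid databases | Published
    7 | TC387 | GB/T 30989-2014 | Technical procedures for high-throughput gene sequencing | Published
    8 | TC387 | GB/T 35537-2017 | Requirements for evaluating high-throughput gene sequencing results | Published
    9 | TC387 | GB/T 35890-2018 | Specification for sequence formats of high-throughput sequencing data | Published
    10 | TC387 | 20184465-T-469 | General rules for testing magnetic-bead DNA extraction and purification kits | Published
    11 | TC387 | 20184468-T-469 | Metagenomic detection of environmental microorganisms - High-throughput sequencing method | Published
    12 | TC387 | 20184467-T-469 | General guidelines for methods of detecting cross-contamination in mammalian cells | Published
    13 | National Research Centre for Certified Reference Materials | GB/T 37870-2019 | High-throughput sequencing method for individual identification | Published
    14 | National Research Centre for Certified Reference Materials | GB/T 37872-2019 | General rules for evaluating the quality of target gene region capture | Published
    15 | National Research Centre for Certified Reference Materials | GB/T 37873-2019 | General rules for evaluating the quality of synthetic genes | Published
    16 | TC559 | GB/T 38576-2020 | Collection and processing of human blood samples | Published
    17 | TC559 | GB/T 38735-2020 | Collection and processing of human urine samples | Published
    18 | TC559 | GB/T 37864-2019 | General requirements for the quality and competence of biobanks | Published
    19 | TC559 | GB/T 38736-2020 | Ethical requirements for the preservation of human biological samples | Published
    20 | TC559 | 20192118-T-469 | Management specification for human biobanks | Draft
    21 | TC559 | 20193179-T-469 | Basic terminology for human biobanks | Draft
    22 | TC559 | 20162694-T-424 | General technical specification for testing nucleic acid samples by high-throughput sequencing | Draft
    23 | 424-cnis | GB/T 38551-2020 | Identification of plant varieties - MNP marker method | Published
    24 | 424-cnis | GB/T 38570-2020 | Determination of genetically modified components in plants - Target sequence sequencing method | Published
    25 | 424-cnis | GB/T 29859-2013 | Bioinformatics terminology | Published
    26 | 424-cnis | 20183201-Z-424 | Targeted poverty alleviation - Operation and management specification for hairy crab projects | Draft

    5 Problems and recommendations

    5.1 Traditional identification techniques face challenges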

    Views: 57 | Published: 2020-12-01 | 28 pages | Rating: 5 stars
  • China Electronics Standardization Institute (中国电子技术标准化研究院): 2020 Face Recognition Industry Research Report (56 pages).pdf

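    SAC/TC28/SC37
    2020 Face Recognition Industry Research Report (2020 年人脸识别行业研究报告)

    Among domestic organizations, in addition, the National Technical Committee for Anti-counterfeiting Standardization (SAC/TC218) has published the national standard "Technical requirements for biometric anti-counterfeiting - Part 1: Face recognition". The National Financial Standardization Technical Committee (SAC/TC 180) has published the national standard "Financial services - Biometrics - Security framework" and is developing biometric industry standards such as the "Security application specification for offline payment using face recognition technology". The Technical Committee for Basic Standardization of Public Security Applications under the Ministry of Public Security has published the industry standard "Video image analysers - Part 4: Technical requirements for face analysis". The National Information Security Standardization Technical Committee (SAC/TC260) has published national standards including "Information security technology - Technical requirements for remote face recognition systems".

    4.2 Development and revision of standards

    4.2.1 International face recognition standards

    Face recognition standards from the International Organization for Standardization and other leading foreign standards bodies are summarized in Tables 1 and 2.

    Table 1: ISO international standards
    No. | Standard no. | Title
    1 | ISO/IEC 19794-5:2005 | Information technology - Biometric data interchange formats - Part 5: Face image data
    2 | ISO/IEC 19794-5:2011 | Information technology - Biometric data interchange formats - Part 5: Face image data
    3 | ISO/IEC 29109-5:2019 | Information technology - Conformance testing methodology for biometric data interchange formats defined in ISO/IEC 19794 - Part 5: Face image data
    4 | ISO/IEC TR 29794-5:2010 | Information technology - Biometric sample quality - Part 5: Face image data
    5 | ISO/IEC 39794-5:2019 | Information technology - Extensible biometric data interchange formats - Part 5: Face image data
    6 | ISO/IEC 24779-5:2020 | Information technology - Cross-jurisdictional and societal aspects of implementation of biometric technologies - Pictograms, icons and symbols for use with biometric systems - Part 5: Face applications
    7 | ISO/IEC AWI 24357 | Performance evaluation of face image quality algorithms
    8 | ISO/IEC WD 24358 | Face-aware capture subsystem specifications

    Table 2: Standards from other leading foreign standards bodies
    No. | Organization | Standard no. | Title
    1 | IEEE | IEEE Std 2790-2020 | Biometric Liveness Detection
    2 | IEEE | IEEE P2884 | Performance Evaluation of Biometric Information: Facial Recognition

    4.2.2 Domestic face recognition standards

    China's face recognition standards are listed in Table 3.

    Table 3: Domestic face recognition standards
    No. | Organization | Standard no. | Title | Status
    1 | SAC/TC28/SC37 | GB/T 33767.5-2018 | Information technology - Biometric sample quality - Part 5: Face image data | Published
    2 | SAC/TC28/SC37 | GB/T 26237.5-2014 | Information technology - Biometric data interchange formats - Part 5: Face image data | Published
    3 | SAC/TC28/SC37 | GB/T 33842.5-2018 | Information technology - Conformance testing methodology for biometric data interchange formats defined in GB/T 26237 - Part 5: Face image data | Published
    4 | SAC/TC28/SC37 | GB/T 37036.3-2019 | Information technology - Biometrics on mobile devices - Part 3: Face | Published
    5 | SAC/TC28/SC37 | SJ/T 11608-2016 | General specification for face recognition equipment | Published
    6 | SAC/TC28/SC37 | 20201565-T-469 | Information technology - Biometrics - Technical requirements for face recognition systems | Draft
    7 | SAC/TC28/SC37 | 20202792-T-469 | Information technology - Biometrics - Test methods for face recognition systems | Draft
    8 | SAC/TC100/SC2 | GA/T 922.2-2011 | Face recognition application systems for security - Part 2: Face image data | Published
    9 | SAC/TC100/SC2 | GA/T 1093-2013 | Technical requirements for face recognition systems for access control | Published
    10 | SAC/TC100/SC2 | GA/T 1126-2013 | Technical requirements for near-infrared face recognition equipment | Published
    11 | SAC/TC100/SC2 | GA/T 1212-2014 | Test methods for resistance to prosthesis attacks in face recognition applications for security | Published
    12 | SAC/TC100/SC2 | GB/T 31488-2015 | Technical requirements for face recognition systems in security video surveillance | Published
    13 | SAC/TC100/SC2 | GA/T 1344-2016 | Technical requirements for extracting face images from video in face recognition applications for security | Published
    14 | SAC/TC100/SC2 | GA/T 1324-2017 | Security - Face recognition applications - Specification for static face image acquisition | Published
    15 | SAC/TC100/SC2 | GA/T 1325-2017 | Security - Face recognition applications - Specification for video image acquisition | Published
    16 | SAC/TC100/SC2 | GB/T 35678-2017 | Public security - Face recognition applications - Technical requirements for images | Published
    17 | SAC/TC100/SC2 | GA/T 1326-2017 | Security - Face recognition applications - Specification for application programming interfaces | Published
    18 | SAC/TC100/SC2 | GA/T 1470-2018 | Security - Face recognition applications - Classification | Published
    19 | SAC/TC 218 | GB/T 38427.1-2019 | Technical requirements for biometric anti-counterfeiting - Part 1: Face recognition | Published
    20 | SAC/TC 180 | (none listed) | Security application specification for offline payment using face recognition technology (trial) | Published
    21 | Technical Committee for Basic Standardization of Public Security Applications, Ministry of Public Security | GA/T 1154.4-2018 | Video image analysers - Part 4: Technical requirements for face analysis | Published
    22 | Technical Committee for Basic Standardization of Public Security Applications, Ministry of Public Security | GA/T 1723.4-2020 | Online authentication of resident identity - Authentication services - Part 4: Technical requirements for face image acquisition controls | Published
    23 | Technical Committee for Basic Standardization of Public Security Applications, Ministry of Public Security | GA/T 1723.5-2020 | Online authentication of resident identity - Authentication services - Part 5: Interface requirements for face comparison engines | Published
    24 | China Electronics Standardization Association | T/CESA 1124-2020 | Information security technology - Security technical specification for face comparison models | Published
    25 | China Security and Protection Industry Association | T/CSPIA 003-2020 | Technical requirements for face capture equipment for security | Published
    26 | Guangdong Product Certification Service Association | T/GDC 71-2020 | Technical requirements for multi-dimensional face recognition testing | Published
    27 | Zhejiang Security Technology and Protection Industry Association | T/ZJAF 5-2020 | Security - Specification for face data security management | Published
    28 | Guangdong Market Association | T/GDMA 27-2020 | Specification for building dynamic face recognition systems for public security | Published
    29 | Guangdong Market Association | T/GDMA 24-2020 | Technical guide to high-precision monocular silent face liveness detection | Published
    30 | Shenzhen | SZDB/Z 316-2018 | Specification for building the front end of dynamic face recognition systems | Published

    4.3 Recommendations for standardization

    4.3.1 Strengthen the development of basic standards and promote the integration of standards and patents

    Basic standards support, optimize and regulate face recognition products throughout their life cycle, from design, production, testing and certification to application and upgrade. They are a tool and a safeguard for improving product quality, expanding application scenarios, advancing technology and spreading innovation. We recommend further improving the system of face recognition technical standards to cover algorithms, platforms, technologies, products, applications, security and testing. Work should focus on the priorities of industry development, on important products in key areas and on the need to protect personal information. Basic, common technical standards on the consistency, reliability and security of face recognition products should be developed, revised and validated by testing, so that advanced standards drive product upgrading and improvements in quality, algorithms and information protection.

    Standards and patents are important pillars of industry development. For a high-tech industry such as face recognition in particular, the development of basic standards should be strengthened, and a healthy interaction between standards development and intellectual property protection should be encouraged so as to increase the positive effect of integrating standards and patents.

    4.3.2 Accelerate the development of association standards and promote the application and deployment of technology

    In recent years, with the development of AI and the needs of national economic development and security, the market for face recognition applications has kept growing and the development of related standards has attracted wide attention. Societies, associations, chambers of commerce, federations and other social organizations that are in a position to do so are encouraged to coordinate the relevant market players in line with the needs of technical innovation and market development, and to develop and publish their own association standards for leading-edge face recognition technologies, effectively increasing face recognition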

    Views: 211 | Published: 2020-12-01 | 48 pages | Rating: 5 stars
  • China Electronics Standardization Institute (中国电子技术标准化研究院): 2020 Vein Recognition Industry Research Report (56 pages).pdf

    SAC/TC28/SC37
    2020 Vein Recognition Industry Research Report (2020 年静脉识别行业研究报告)

    Table 1: Standards published by international standards organizations
    No. | Organization | Standard no./plan no. | Title
    1 | ISO/IEC JTC1/SC37 | ISO/IEC 19794-9:2011 | Information technology - Biometric data interchange formats - Part 9: Vascular image data
    2 | ISO/IEC JTC1/SC37 | ISO/IEC 19794-9:2011/Cor1:2012 | Information technology - Biometric data interchange formats - Part 9: Vascular image data - Technical Corrigendum 1
    3 | ISO/IEC JTC1/SC37 | ISO/IEC 19794-9:2011/Amd1:2013 | Information technology - Biometric data interchange formats - Part 9: Vascular image data - Amendment 1: Conformance testing methodology
    4 | ISO/IEC JTC1/SC37 | ISO/IEC 19794-9:2011/Amd2:2015 | Information technology - Biometric data interchange formats - Part 9: Vascular image data - Amendment 2: XML encoding and clarification of defects
    5 | ISO/IEC JTC1/SC37 | ISO/IEC 19794-15:2017 | Information technology - Biometric data interchange formats - Part 15: Palm crease image data
    6 | ISO/IEC JTC1/SC37 | ISO/IEC 29109-9:2011 | Information technology - Conformance testing methodology for biometric data interchange formats defined in ISO/IEC 19794 - Part 9: Vascular image data

    4.1.2 Domestic vein recognition standards organizations

    The main standards organizations for vein recognition in China are the following.

    (1) The Biometrics Subcommittee of the National Information Technology Standardization Technical Committee (SAC/TC28/SC37) has set up a vein recognition working group responsible for vein recognition standardization, including interface standards, data interchange format standards, general product specifications, test method standards and industry application standards. The vein recognition standards already published by SAC/TC28/SC37 are listed in Table 2.

    Table 2: Vein recognition standards published by SAC/TC28/SC37
    No. | Organization | Standard no./plan no. | Title
    1 | SAC/TC28/SC37 | GB/T 26237.9-2014 | Information technology - Biometric data interchange formats - Part 9: Vascular image data
    2 | SAC/TC28/SC37 | GB/T 32903-2016 | Information technology - Finger vein recognition systems - Finger vein image data format
    3 | SAC/TC28/SC37 | GB/T 33135-2016 | Information technology - Finger vein recognition systems - General specification for finger vein capture devices

    (2) The National Technical Committee for Security and Alarm System Standardization (SAC/TC100) and its Subcommittee on Human Biometric Applications (SAC/TC100/SC2) are responsible for standardization of products, application systems, and testing and inspection in security and alarm systems where human biometric applications are the main content. The vein recognition standards already published by SAC/TC100 and SAC/TC100/SC2 are listed in Table 3.

    Table 3: Vein recognition standards published by SAC/TC100 and SAC/TC100/SC2
    No. | Organization | Standard no./plan no. | Title
    1 | SAC/TC100 | GA/T 938-2012 | General technical requirements for equipment in finger vein recognition application systems for security
    2 | SAC/TC100 | GA/T 939-2012 | Algorithm evaluation methods for finger vein recognition application systems for security
    3 | SAC/TC100 | GA/T 940-2012 | Technical requirements for images in finger vein recognition application systems for security
    4 | SAC/TC100 | GA/T 1395-2017 | Technical requirements for images in palm vein recognition applications for security
    5 | SAC/TC100 | GB/T 35676-2017 | Public security - Finger vein recognition applications - Methods for evaluating algorithm recognition performance
    6 | SAC/TC100 | GB/T 35742-2017 | Public security - Finger vein recognition applications - Technical requirements for images
    7 | SAC/TC100/SC2 | GA/T 1213-2014 | Technical requirements for 3D data in finger vein recognition applications for security
    8 | SAC/TC100/SC2 | GA/T 1181-2014 | Specification for application programming interfaces in finger vein recognition applications for security

    Views: 126 · Published: 2020-12-01 · 48 pages · Rating: 5 stars
  • China Electronics Standardization Institute: 2020 Iris Recognition Industry Research Report (38 pages).pdf

    SAC/TC28/SC37 · 2020 Iris Recognition Industry Research Report

    4.1.2 Standards

    Table 1. International standards related to iris recognition
    No. | Standard No. | Title
    1 | ISO/IEC 19794-6:2005 | Information technology - Biometric data interchange formats - Part 6: Iris image data
    2 | ISO/IEC 19794-6:2011 | Information technology - Biometric data interchange formats - Part 6: Iris image data
    3 | ISO/IEC 19794-6:2011/AMD 1:2015 | Information technology - Biometric data interchange formats - Part 6: Iris image data - Amendment 1: Conformance testing methodology and clarification of defects
    4 | ISO/IEC 19794-6:2011/COR 1:2012 | Information technology - Biometric data interchange formats - Part 6: Iris image data - Technical Corrigendum 1
    5 | ISO/IEC 19794-6:2011/AMD 2:2016 | Information technology - Biometric data interchange formats - Part 6: Iris image data - Amendment 2: XML encoding and clarification of defects
    6 | ISO/IEC 29109-6:2011 | Information technology - Conformance testing methodology for biometric data interchange formats defined in ISO/IEC 19794 - Part 6: Iris image data
    7 | ISO/IEC 29794-6:2015 | Information technology - Biometric sample quality - Part 6: Iris image data
    8 | ISO/IEC FDIS 39794-6 | Information technology - Extensible biometric data interchange formats - Part 6: Iris image data
    Note: ISO/IEC 19794-6:2011 and ISO/IEC FDIS 39794-6 have not yet been adopted as national standards in China.

    4.2 Domestic standardization

    4.2.1 Standardization bodies

    In China, biometric recognition standards are developed mainly by the Biometrics Subcommittee of the National Information Technology Standardization Technical Committee (SAC/TC28/SC37), the Authentication and Authorization Working Group of the National Information Security Standardization Technical Committee (SAC/TC260/WG4), and the Subcommittee on Human Biometric Recognition Applications of the National Technical Committee on Security and Protection Alarm Systems (SAC/TC100/SC2). SAC/TC28/SC37 has set up working groups on iris recognition, face recognition, mobile devices, and other topics, and has published standards on iris recognition devices, iris sample quality, and the iris image data interchange format. SAC/TC260/WG4 has published standards such as the technical requirements for iris recognition systems. SAC/TC100/SC2 is drafting public-security standards covering general technical requirements for iris capture devices, image technical requirements, and algorithm evaluation methods.

    4.2.2 Standards

    SAC/TC28/SC37 has developed 4 iris recognition standards, SAC/TC100/SC2 has developed 7, and SAC/TC260/WG4 has developed 1. Details are given in Table 2.

    Table 2. National standards and standard plans related to iris recognition
    No. | Standards body | Standard No. / plan No. | Title | Status | Adoption of international standard
    1 | SAC/TC28/SC37 | GB/T 35783-2017 | Information technology - General specification for iris recognition devices | Published | None
    2 | | GB/T 33767.6-2018 | Information technology - Biometric sample quality - Part 6: Iris image data | Published | Identical adoption of ISO/IEC 29794-6:2015
    3 | | GB/T 26237.6-2014 | Information technology - Biometric data interchange formats - Part 6: Iris image data | Published | Non-equivalent adoption of ISO/IEC 19794-6:2005
    4 | | 20173821-T-469 | Information technology - Biometric recognition on mobile devices - Part 4: Iris | Submitted for approval | None
    5 | SAC/TC260 | GB/T 20979-2019 | Information security technology - Technical requirements for iris recognition systems | Published | None
    6 | SAC/TC100/SC2 | 20184357-T-312 | Public security - Iris recognition applications - Algorithm evaluation methods | Draft for comment | None
    7 | | 20184358-T-312 | Public security - Iris recognition applications - General technical requirements for capture devices | Draft for comment | None
    8 | | 20184356-T-312 | Public security - Iris recognition applications - Image technical requirements | Draft for comment | None
    9 | | GA/T 1208-2014 | Security iris recognition applications - Algorithm evaluation methods | Published | None
    10 | | GA/T 1286-2015 | Security iris recognition applications - Image data interchange format | Published | None
    11 | | GA/T 1429-2017 | Security iris recognition applications - Image technical requirements | Published | None
    12 | | GA/T 1486-2018 | Security and protection iris recognition applications - Application programming interface specification | Published | None

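    Views: 125 · Published: 2020-12-01 · 29 pages · Rating: 5 stars
  • 2020 Industry Research Report on the Smartphone Camera CIS Market and the Under-Display Fingerprint Recognition Competitive Landscape (18 pages).pptx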

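    2020 In-Depth Industry Analysis Report

    Contents: (1) Amid the 5G multi-camera trend, CIS sits on a golden track; (2) Under-display fingerprint recognition keeps upgrading and penetration is accelerating; (3) Major companies in the global CIS and fingerprint recognition supply chain.

    Cameras: upstream, downstream, and competitive landscape

    Cameras keep upgrading: the smartphone is the most important product in consumer electronics, and photo quality is a key concern for consumers, so optics has long been a major direction of smartphone innovation. It has passed in turn through pixel upgrades, front and rear cameras, multiple cameras, and biometric recognition. (Source: 旭日大数据)

    Camera supply chain structure: the sensor, VCM (voice coil motor), lens, and similar components form the upstream; midstream module makers package these components into camera modules; downstream applications include phones, tablets, PCs, and other electronics.

    Value distribution along the chain: the CIS image sensor accounts for 52% of the value and is the most valuable component. The optical lens and the module account for 19% and 20% respectively, roughly on par and second only to the CIS. The voice coil motor and the infrared cut filter account for 6% and 3%.

    The phone camera market will grow steadily: rising penetration of rear dual and triple cameras, pixel upgrades, and demand for 3D and motion sensing all drive growth in camera counts. Phones will remain the main driver of the camera market.

    CIS pixel upgrades: 8MP, 12MP, 13MP, 16MP, 24MP, 32MP, 48MP, 64MP. Both front and rear smartphone cameras keep gaining pixels.

    Smartphone pixel counts keep rising: flagship models keep upgrading. Taking Huawei as an example, the rear main camera has gone from 20MP to 40MP and even 50MP, and the front camera from 8MP to 32MP.

    48MP is enough for 4K display: even a 4K display (3840*2160 or 4096*2160) has fewer than 9 million pixels. A 48MP sensor can still output 12MP in low light, so a 48MP sensor built on 0.8 µm pixels fully meets 4K display requirements.

    Is upgrading beyond 48MP still worthwhile? From the demand side we think it is. Although 48MP already meets 4K display requirements, higher-resolution photos give more flexibility for cropping and other post-processing. And as AR/VR spreads in the 5G era, the camera is the content-production end, so more pixels remain necessary for AR/VR and other new applications.

    [Figure: camera configurations of new Huawei phones by year. Source: Huawei website]

    CIS pixel upgrades: bottlenecks and limits on the technology side; 0.8 µm is the current pixel-size limit

    A 0.8 µm pixel is currently the limit for an adequate signal-to-noise ratio: the mainstream pixel size fell from 1.4 µm to 1.12 µm and then rebounded to 1.4 µm, which is the limit for preserving SNR under all lighting conditions. The current limit is 0.8 µm; Sony is developing 0.7 µm, but no product has shipped yet. At 0.8 µm, hardware 48MP output is only guaranteed in bright light; in low light the output is 12MP, equivalent to a 1.6 µm pixel.

    The 0.8 µm limit means pixel counts cannot grow indefinitely: image quality is determined jointly by the CIS pixel count and pixel size, and both are bounded by the optical format. The CIS with the largest optical format today is the 50MP rear sensor in the Huawei P40 series, at 1/1.28 inch.

    Pixel count: more pixels mean better image quality. Main-camera pixel counts keep rising; 48MP is becoming mainstream on flagships, vendors continue to launch 64MP products, and the upgrade is ongoing.

    Light-sensitive area per pixel: pixel size is the other key factor in CIS image quality. The larger the pixel, the larger the photodiode inside it and the better its light sensitivity.

    Optical format: more pixels and a larger light-sensitive area per pixel both improve imaging, but they imply a larger CIS optical format. Because of smartphone thickness limits, the CMOS optical format cannot grow indefinitely. For a fixed optical format, raising the pixel count and raising the pixel size are hard to reconcile.

    [Figure: pixel count and pixel area are hard to improve together; both determine image quality and both are positively related to it. Because of the flange distance, a larger optical format makes the camera module thicker, while a smartphone is typically about 8mm thick, so the optical-format limit creates a bottleneck for CIS upgrades. Sources: Sony, Samsung, and OmniVision websites]

    48MP is an important threshold in CIS pixel upgrades

    Traditional 12MP sensor: uses a Bayer array, imaging with RGB distributed over a 2x2 cell of four pixels; hardware output is 12MP.

    Pseudo-48MP sensor: the array is expanded to 4x4, but each 2x2 block can sense only one color and can only be read out together, which is not fundamentally different from 12MP. Hardware output is 12MP, with 48MP produced by software interpolation (Samsung GM1).

    True 48MP sensor: uses a Quad Bayer array, which expands the array to 4x4 and places RGB adjacently in 2x2 groups. In low light it bins 4-in-1 and outputs 12MP; in bright light the hardware outputs 48MP (Sony IMX586).

    Technical challenges in moving from a traditional 12MP sensor to 48MP: the pixel size is halved, from 1.6 µm to 0.8 µm, which raises process difficulty. Converting Quad Bayer to normal Bayer needs two algorithms: binning mode in low light, outputting 12MP, and a remosaic algorithm in bright light, outputting 48MP.

    CIS structure upgrades: front-side illuminated, then back-side illuminated, then stacked back-side illuminated

    Structure: a typical CMOS sensor consists of on-chip lenses, color filters, metal wiring, photodiodes, and a substrate.

    Front-side illuminated (FSI): in the traditional front-illuminated structure, the metal wiring and transistors that make up the sensor's light-sensitive region sit on the surface of the silicon substrate, which obstructs the light collected by the on-chip lenses.

    Back-side illuminated (BSI): the metal wiring and transistors are moved to the other side of the silicon substrate, reducing the obstruction and greatly increasing the light reaching each pixel.

    Advantages of stacking: the pixel area and the logic control unit are stacked vertically instead of placed side by side, so the pixel area takes up a much larger share of the chip. Because the two are fabricated separately, each can use a different wafer process as needed and be upgraded independently. Upgrading the logic control unit improves image signal processing and enables more functions, such as hardware HDR and slow-motion capture.

    Stacked structure: Sony has successively introduced two-layer (CIS + ISP) and three-layer (CIS + ISP + DRAM) stacked designs.

    [Figures: FSI versus BSI principle; BSI versus stacked BSI principle; comparison of the three CIS chip types. Source: IC Insights]

    3D cameras become standard: from flat to three-dimensional, structured light and ToF open up the consumer electronics market

    ToF module. Principle: an infrared light source emits high-frequency light pulses at the object, the module receives the pulses reflected back, and the distance from the camera to the object is computed from the round-trip flight time of the pulses. Structure: flood illuminator + near-infrared camera. The flood illuminator consists of a high-power VCSEL (to emit light pulses at the object) + diffuser; the near-infrared camera consists of an infrared CMOS sensor + narrow-band filter + focusing lens. (Example: OPPO R17 Pro ToF module)

    3D structured-light module. Principle: an infrared light source projects coded patterns onto the object. When the patterns reflect off the object's surface they deform differently depending on the object's distance. The image sensor captures the deformed pattern and, using triangulation, computes the displacement of each pixel in the captured pattern to obtain the disparity and from it the depth. Structure: dot projector + flood illuminator + near-infrared camera. The dot projector consists of a high-power VCSEL (to emit near-infrared light at a specific wavelength) + WLO lens (wafer-level optics, to turn the VCSEL output into a uniform collimated beam with a larger cross-section that covers the DOE) + DOE (diffractive optical element, to form the specifically coded optical pattern). The flood illuminator consists of a low-power VCSEL (for fill light in dim environments) + diffuser. The near-infrared camera consists of an infrared CMOS sensor + narrow-band filter + focusing lens. (Example: iPhone X 3D structured-light module)

    (Sources: SYSTEMPlus, 快科技)

    3D cameras become standard: the 3D sensing market is growing fast

    3D sensing has become an industry trend and the market is growing quickly: in 2019, 3D sensing phones were mostly flagships. Apple represents structured light, with every model since the iPhone X carrying it, while Huawei has the most ToF-equipped models; Apple will also ship ToF models this year. Yole forecasts that the global 3D imaging and sensing market will grow at a CAGR of 38% over 2016-2022, from US$1.83 billion in 2017 to more than US$9 billion in 2022. Consumer electronics is the fastest-growing application, with a 2016-2022 CAGR of 160%.

    [Figure: comparison of structured-light and ToF principles and performance. Source: OPPO website]

    CIS market: the multi-camera trend drives growth in CIS for phones

    Cameras per smartphone keep rising: phones are moving from dual to multiple cameras. More cameras have progressively improved wide-angle, telephoto, macro, bokeh, and other 3D imaging quality, and have greatly boosted the image sensor (CIS) market. Each phone used about 3.1 cameras on average in 2019, and this is expected to reach 4.3 by 2024. The smartphone CIS market was US$13.75 billion in 2019 and is expected to reach US$23.35 billion in 2022, a compound annual growth rate of 19.3%.

    [Charts: average number of cameras per smartphone (series as extracted: 2.2, 2.4, 2.7, 3.1, 3.9, 4.3, 4.8); distribution of smartphones by camera count (2, 3, 4, 5, more than 5); global smartphone CIS market size. Source: CINNO Research]

    CIS market: automotive and security add to the industry's growth momentum

    Automotive CIS: although global car demand is weak, vehicle intelligence is raising the number of cameras per car. High-end cars can carry as many as 8 cameras across their assistance systems, and we forecast that future cars may carry 12. Side-view, surround-view, front-view, and in-cabin lenses also place high demands on CIS performance, pushing the automotive CIS market to a 25% CAGR over 2019-2022.

    Security CIS: with the strong push for smart cities and the Xueliang (雪亮) Project, urban video surveillance is strengthening its use of AI. Pixel resolution and ultra-low-light night vision are the precondition for AI to work well, and CIS plays an important role. We estimate a 23% CAGR for the security CIS market over 2019-2022.

    The global CIS market will grow at a 17% CAGR over 2019-2022: the phone multi-camera trend and rising demand for automotive and security cameras are the three growth drivers.

    [Charts: global automotive CIS market size; global security CIS market size; global CIS market size by segment (phones, automotive, security surveillance, other), in units of US$100 million; phone series as extracted: 76.2, 93.3, 106.1, 137.5, 163.6, 211.4, 233.5. Source: CINNO Research]

    The global CIS competitive landscape is healthy, and OmniVision benefits from localization

    [Charts: global CIS market shares, four pie charts for 2016-2019; year labels were lost in extraction. Values in legend order Sony, Samsung, OmniVision, ON Semiconductor, STMicroelectronics, Others: 42.4%, 21.6%, 9.0%, 4.6%, 6.4%, 16.1%; 33.0%, 27.5%, 9.7%, 4.9%, 7.7%, 17.2%; 42.4%, 19.5%, 10.4%, 5.9%, 5.5%, 16.3%; 41.2%, 18.9%, 13.1%, 5.3%, 3.2%, 18.3%. The global CIS competitive landscape was broadly stable over 2016-2019. Source: Yole Développement]

    A healthy global competitive landscape: the CIS market is concentrated among a few large players. Sony has long held more than 40% of the market, followed by Samsung. Since 2011 OmniVision's share has been declining, mainly because it was overtaken by Sony and Samsung at the high end while Chinese and Korean vendors such as Hynix, GalaxyCore (格科微), SuperPix (思比科), and Himax (奇景) ate into the low end.

    Seizing the 48MP localization opportunity, OmniVision's share should rise: in 2020, monthly capacity demand for OmniVision's high-end 48MP CIS is expected to rise from 55,000-58,000 wafers to about 70,000, while mid-range CIS (such as 12MP) and low-end CIS (8MP and below) each need about 20,000 wafers a month. 48MP has become the main driver of demand for OmniVision, which is well placed to win back lost ground in the phone CIS market.

    Limited capacity expansion keeps supply short: on the supply side, the IDMs Sony and Samsung are running at full capacity, and Sony, short of capacity, has outsourced to the foundry TSMC. Orders from fabless vendors such as OmniVision and GalaxyCore already exceed supply and have left foundry capacity very tight. We expect very limited high-end CIS expansion in 2020, mainly because high-pixel CIS chips use the BSI process, whose equipment is highly customized, so fabs are reluctant to expand. The industry mostly resorts to modified FSI processes or split processing, and the focus remains on squeezing more out of existing capacity. New capacity will be limited, so the CIS shortage is expected to last through 2020. Rising prices confirm both the supply shortage and the healthy global competitive landscape.

    [Chart: CIS price increases. Source: 半导体行业观察]

    Full-screen designs eliminate capacitive fingerprint sensors; under-display fingerprint recognition becomes mainstream

    Optical under-display fingerprint recognition has become the most important biometric solution of the full-screen era: as phones moved to full-screen designs, traditional capacitive fingerprint recognition was replaced by other biometric solutions. Apple adopted structured-light 3D recognition, while other phones turned to under-display fingerprint recognition. There are two mainstream under-display approaches. One is Qualcomm's ultrasonic solution, used exclusively by Samsung. The rest of the Android camp, including Huawei, Honor, Xiaomi, OPPO, VIVO, Meizu, and OnePlus, all use optical under-display fingerprint recognition.

    [Figure: comparison of the three fingerprint recognition approaches. Source: CINNO Research]

    Under-display fingerprint recognition: by its second generation, the optical approach was clearly more cost-effective than ultrasonic

    The second-generation optical solution greatly cut module cost but increased module thickness: the 2019 second-generation design replaced the collimator layer with a lens, improving image quality, and fixed the whole module to the mid-frame so that it no longer had to be bonded to the display, greatly reducing module cost compared with the first generation. Thanks to its lower cost, the lens-based optical solution drove a rapid rise in OLED under-display fingerprint penetration in 2019.

    The third-generation optical solution greatly reduces module thickness: introduced by Goodix at the end of 2019, it uses microlenses in place of the traditional large lens, sharply compressing the optical path so that module thickness falls to 0.3-0.5mm. The module can be stacked between the display and the battery, giving more design freedom and letting phone makers fit a larger battery.

    [Figure: comparison of the three generations of optical solutions and the ultrasonic solution. Source: CINNO Research]

    Under-display fingerprint recognition: the second-generation optical solution is highly cost-effective; penetration accelerated in 2019

    Under-display fingerprint recognition is rapidly replacing capacitive sensors: in 2019, as flagships adopted full-screen designs, under-display fingerprint recognition fully replaced traditional capacitive sensors. Under-display fingerprint sensor shipments reached 200 million units in 2019, a 7-fold increase over 25 million in 2018. Shipments in 2020 are expected to be about 273 million, up 36.5% year on year, with growth held back by the pandemic. As its effects fade, under-display fingerprint recognition should return to rapid growth, with an expected CAGR of 64% over 2019-2022.

    Optical under-display fingerprint recognition is the mainstay: of the 200 million under-display fingerprint chips shipped in 2019, optical technology accounted for about 77.5%, or 155 million units, with Goodix, Egis (神盾), and Silead (思立微) as the main brands. Ultrasonic technology accounted for about 22.5%, or 45 million units, monopolized by Qualcomm and supplied exclusively to Samsung.

    Optical under-display fingerprint recognition spread quickly because the second generation greatly improved cost-effectiveness: the lens-based design improved image quality and recognition rate (even surpassing ultrasonic) while greatly reducing cost, driving a rapid rise in OLED under-display fingerprint penetration.

    Goodix leads the optical under-display fingerprint market: in 2019 Goodix shipped about 110 million OLED optical under-display fingerprint solutions, a 71% share of the optical segment and 55% of the overall under-display fingerprint market, mainly because it led the upgrades and innovation in optical solutions.

    [Charts: optical under-display fingerprint recognition entered rapid penetration in 2019; shipments of optical and ultrasonic under-display fingerprint sensors; market shares: Goodix 55%, Qualcomm 25%, Egis 12%, Silead 8%. Source: CINNO Research]

    Under-display fingerprint recognition: in the 5G era, the third-generation ultra-thin optical solution is becoming the trend

    The second-generation optical solution has a thick module: its main advances were better image quality and lower price, which greatly improved cost-effectiveness. But because enough optical path must be reserved for the lens, the module is typically 3-4mm thick, so it has to be placed offset from the battery and eats into battery space.

    The third-generation ultra-thin solution meets the needs of 5G phones: 5G chips draw more power, antenna counts increase, and multiple cameras are becoming the norm, all of which squeeze battery space and limit the use of the second-generation solution in 5G phones. Goodix's third-generation solution, introduced at the end of 2019, replaces the traditional lens with microlenses, cutting module thickness to 0.3-0.5mm so that the module can be stacked between the display and the battery, which increases design freedom and allows a larger battery.

    With the 5G replacement cycle, the third-generation ultra-thin solution is likely to become mainstream: Goodix can already bring module thickness below 0.3mm and is working on the high cost. As 5G phones ramp up in 2020, ultra-thin under-display fingerprint solutions should steadily gain penetration and become the mainstream approach.

    [Figure: comparison of second- and third-generation optical solutions. The Xiaomi CC9 Pro is the first phone to use the ultra-thin optical solution. Source: Xiaomi website]

    Major companies in the global CIS and fingerprint recognition supply chain

    In the CIS and fingerprint recognition markets, Japan, Korea, and China form a three-way balance.

    [Chart: bubble chart of gross margin (%) against revenue (US$100 million) for CIS and fingerprint recognition companies, with bubble size representing revenue: Goodix, Egis, Silead, OmniVision, Sony, Samsung. Based on CY2019 company data. Sources: Wind, Bloomberg]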
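    The pixel arithmetic in the slides above (4-in-1 binning of a 48MP, 0.8 µm sensor down to 12MP at an effective 1.6 µm, and the 4K pixel counts) can be checked with a short sketch. This is an illustration only: the function names `bin_2x2` and `binned_output` are hypothetical, the binning averages each 2x2 block of a plain 2D list, and it says nothing about how any particular sensor combines charge in hardware.

```python
def bin_2x2(raw):
    """Average each non-overlapping 2x2 block of a 2D list (4-in-1 binning).

    Assumption for illustration: on a Quad Bayer layout each 2x2 block sits
    under one color filter, so the four values can be combined into one.
    """
    rows, cols = len(raw), len(raw[0])
    if rows % 2 or cols % 2:
        raise ValueError("both dimensions must be even")
    return [
        [
            (raw[r][c] + raw[r][c + 1] + raw[r + 1][c] + raw[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def binned_output(megapixels, pitch_um):
    """Return (output megapixels, effective pixel pitch in µm) after 4-in-1 binning."""
    # Four pixels merge into one, so the count drops 4x and the pitch doubles.
    return megapixels / 4, pitch_um * 2


# Pixel counts of the two 4K formats quoted in the slides.
UHD_4K = 3840 * 2160
DCI_4K = 4096 * 2160

if __name__ == "__main__":
    print(binned_output(48, 0.8))
    print(UHD_4K, DCI_4K)
    print(bin_2x2([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]))
```

    Both 4K formats come out below 9 million pixels, which is why a binned 12MP output still covers a 4K display.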

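    Views: 208 · Published: 2020-09-27 · 19 pages · Rating: 5 stars
  • 2020 China Optics Industry Chain: Smartphone Multi-Camera and Biometric Application Scenarios Market Research Report (28 pages).docx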

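    2020 In-Depth Industry Analysis Report

    Contents
    - Summary of core views
    - High-resolution, ultra-wide, high-zoom multi-camera setups are the first choice for phone optics upgrades
    - High resolution remains the first requirement for phone photography; front and rear pixel upgrades proceed together
    - Main cameras on mainstream brands' current phones exceed 40MP, and the upgrade trend continues
    - The image sensor is the key component of the camera module; pixel upgrades drive CMOS iteration
    - Multi-element lenses are mainstream in the high-pixel era; 6P main-camera lenses reached 64.3% penetration in China in 2018
    - Lens upgrades diversify in the multi-camera era, making it possible for phones to replace DSLRs
    - Large apertures, wide angle, and zoom are on the rise, demanding more of lens makers' design capability
    - Periscope lenses resolve the conflict between high zoom and body thickness; the Huawei P30 Pro and OPPO Reno already use them
    - AI algorithms join in: "clear even against the light" and "light up your beauty" make up for hardware limits
    - Rising multi-camera penetration drives growth across the optics chain, with Android growing faster
    - Multi-camera module assembly is harder; technical strength and innovation decide the winners
    - Android share is rising and its multi-camera upgrades are fast; high-end multi-camera module makers in its supply chain hit record shipments
    - Driven by rising multi-camera penetration, demand for lenses and upstream components is expanding across the board
    - The biometrics trend is rising, and expanding application scenarios bring new opportunities
    - Full-screen designs create new phone-unlocking solutions; under-display optical fingerprint and face recognition develop together
    - VIVO launched the first optical under-display fingerprint unlock in 2017; low cost may speed up adoption
    - Apple introduced the first 3D face recognition solution, starting a new trend in phone biometrics
    - Biometrics open new opportunities in the supply chain; 3D sensing brings new demand
    - 3D face recognition captures biometric information more precisely, and biometric scenarios keep expanding
    - The rise of 3D sensing brings new volume to the chain; demand for modules and upstream components rises together
    - Intelligent driving is rising; the "all-round + high-standard" automotive camera market is just taking off
    - As driving becomes more intelligent, automotive lenses expand from rear view to side, surround, front, and in-cabin views
    - Cameras per vehicle will reach 3 worldwide by 2023, with more room for high-spec automotive lenses
    - As 5G arrives, VR/AR real-scene interaction opens new optical scenarios
    - VR enters a new stage; Fresnel lenses enable wide-FOV lightweight VR
    - The optical system is key to AR imaging; progress in optical waveguides will bring AR to consumers
    - In the Internet-of-Everything era, 3D sensing will reshape what VR/AR real-scene interaction can be

    Summary of core views

    From 2000, when Sharp launched the J-SH04, the world's first camera phone with a rear 110,000-pixel camera, to today's mobile internet era with real-time photo sharing, short video, and live streaming, optics has played an ever larger role in smartphones and has become an important criterion when consumers choose a phone. Slogans built around imaging, such as "soft-light dual cameras, light up your beauty" and "clear even against the light", have become major selling points, and optical upgrades have become a focus of innovation for smartphone makers.

    As phone cameras upgrade to replace DSLRs, pixel count is the first parameter consumers and phone makers look at. Mainstream brands' current phones, including the Huawei Mate 30 series and the OPPO Reno 10x, have main cameras above 40MP, and the trend continues: the Xiaomi CC9 Pro, released on November 5, 2019, has a rear main camera of 100 million pixels. Pixel upgrades directly drove image sensors from CCD to CMOS, and multi-element lenses have become mainstream. According to Huajing Industry Research Institute (华经产业研究院), 35.6% of smartphones shipped in China in 2018 had a five-element (5P) main lens and 64.3% a six-element (6P) main lens, while the CC9 Pro's rear main camera uses a 7P lens (8P in the premium edition).

    To further enrich phone photography and complete the replacement of DSLRs, large-aperture, wide-angle, and zoom solutions have emerged, and multi-camera modules combining high-resolution, wide-angle, zoom, and large-aperture lenses have become standard on flagships. Although global smartphone penetration is nearing saturation and replacement cycles are lengthening, multi-camera module upgrades and rising multi-camera penetration continue. Taking 2019 global shipments of 1.37 billion smartphones as the base, we estimate that a rise in the penetration of phones with three or more cameras from 15% to 50% over 2019-2021 will create demand for 1.44 billion additional cameras (global shipments were 4.15 billion in 2018). With multi-camera lenses also moving toward higher resolution, wider angles, and zoom, the phone optics chain faces growth in both volume and price, benefiting upstream optical components (CIS, motors, and so on), lenses, and modules alike.

    Beyond photography, under-display optical fingerprint and 3D face unlock were adopted in Android phones and iPhones respectively in the full-screen era, starting the biometrics trend and adding a new growth driver for the phone optics chain. Brands such as OPPO and Huawei have also begun adding ToF lenses to rear modules to enhance photos, extending into 3D motion-sensing games, 3D virtual fitting, AR games, and holographic interaction. As 3D sensing penetration in phones rises, applications are spreading to automotive (intelligent driving), VR/AR (3D real-scene interaction), industrial control (virtual 3D visualization of industrial processes), security (3D face recognition and detection), healthcare (VR teaching and case simulation), and home decoration (3D visualization of designs). 3D modules and upstream VCSEL lasers, WLO collimating lenses, narrow-band filters, DOEs, and diffusers will become new sources of volume for the optics chain.

    With 5G commercialization under way and the "Electronics+" era arriving, making non-electronic products electronic and simple electronic products intelligent is the direction for mobile terminals in the IoT era. The car is the most important modern means of transport, and demand for intelligent driving keeps rising. Automotive lenses are expanding from rear view to higher-spec side, front, surround, and in-cabin views, and Yole expects the global average number of cameras per vehicle to rise from 1.7 in 2018 to 3 in 2023. Meanwhile, as 5G arrives, the VR/AR ecosystem is maturing through better hardware, high-speed networks, and broader applications, and 3D-sensing-based real-scene interaction will further improve the VR/AR experience and strengthen its social dimension. We believe VR/AR may become the mainstream wearable of the 5G era after TWS earbuds and smartwatches, and that the related Fresnel lenses, optical waveguides, and 3D sensing will be new territory for the optics chain.

    Given that information input and output on electronic devices depend more and more on optics, we believe that optical innovation built on the phone optics chain, combined with expansion into automotive, VR/AR, industrial control, security, healthcare, and other scenarios, will give the optics chain sustained growth in volume and price, and that companies with technical strength and innovation capability will be the main winners. We recommend Crystal-Optech (水晶光电, optical components) and GoerTek (歌尔股份, optical components), and suggest watching Goodix (汇顶科技, fingerprint recognition), Will Semiconductor (韦尔股份, CIS), OFILM (欧菲光, lenses and modules), and Lianchuang Electronic (联创电子, lenses and modules).

    High-resolution, ultra-wide, high-zoom multi-camera setups are the first choice for phone optics upgrades

    Optical upgrades have become a focus of innovation for smartphone makers. From Sharp's J-SH04 in 2000, the world's first camera phone with a rear 110,000-pixel camera, to Samsung's SCH-B710 in 2007, the world's first phone with rear dual cameras, and OPPO's U701 in 2012, the world's first phone with a beautification mode, phones have gradually replaced cameras and DSLRs. In the mobile internet era, real-time photo sharing, short video, and live streaming have raised consumers' expectations of phone cameras further.

    In 2019 OPPO launched the Reno with 10x optical zoom, Huawei launched the Mate 30 Pro with a Leica quad camera, and Xiaomi launched the CC9 with five rear cameras and a 108MP main camera. Smartphone optical innovation has moved from pixel upgrades alone to diversified multi-camera solutions. According to DxoMark's camera tests, the ten best-scoring smartphones launched in 2019 all have front cameras above 10MP and at least three rear cameras, and the Chinese brands' main cameras exceed 40MP. As optical upgrades continue, we believe a "wide + ultra-wide + telephoto" triple camera or a "wide + ultra-wide + macro + depth" quad camera has become the mainstream multi-camera configuration. Main-camera pixel upgrades, more varied auxiliary cameras, multi-camera module upgrades, and the spread of optical innovation from high-end to mid- and low-end models will all bring sustained volume to the optics chain.

    [Figure 1: evolution of smartphone optical upgrades. Sources: 中关村在线, 科学技术宅, Huawei website, research institute]

    Figure 2: top ten DxoMark smartphone camera scores in 2019, with specifications
    Score | Brand | Model | Front camera | Rear cameras | Rear module
    121 | Huawei | Mate 30 Pro | 32MP | 4 | 40MP ultra-wide main + 40MP ultra-wide + 8MP telephoto + ToF depth camera
    121 | Xiaomi | Mi CC9 Pro | 32MP | 5 | 108MP ultra-high-resolution main + 12MP telephoto + 20MP ultra-wide + 12MP portrait + 8MP super-telephoto
    117 | Apple | iPhone 11 Pro Max | 12MP | 3 | 12MP wide main + 12MP telephoto + 12MP ultra-wide
    117 | Samsung | Galaxy Note 10 5G | 10MP | 4 | 12MP wide main + 12MP telephoto + 16MP ultra-wide + 3D depth camera
    117 | Samsung | Galaxy Note 10 | 10MP | 4 | Same as above
    116 | Huawei | P30 Pro | 32MP | 4 | 40MP SuperSensing main + 20MP ultra-wide + 8MP telephoto + ToF, 10x hybrid zoom
    116 | OPPO | Reno 10x Zoom | 16MP | 3 | 48MP main + 8MP ultra-wide + 13MP, 5x optical zoom
    116 | Samsung | Galaxy S10 5G | 10MP + 8MP | 4 | 12MP main + 16MP ultra-wide + 12MP telephoto + ToF depth camera
    114 | OnePlus | 7 Pro | 16MP | 3 | 48MP main + 16MP ultra-wide + 8MP telephoto
    113 | Honor | 20 Pro | 32MP | 4 | 48MP main + 16MP ultra-wide + 8MP telephoto + 2MP macro
    Sources: DXOMARK, research institute

    High resolution remains the first requirement for phone photography; front and rear pixel upgrades proceed together

    Main cameras on mainstream brands' current phones exceed 40MP, and the upgrade trend continues

    The pixel is the basic unit of a digital image and an important parameter for how faithful an image looks. The higher the pixel count, the higher the photo's resolution, that is, the stronger the lens's ability to resolve the scene. As phone cameras upgrade to replace DSLRs, pixel count has become a key parameter for consumers and phone makers. Huawei's first Mate phone, launched in March 2013, had a 1MP front camera and an 8MP rear camera. By September 2019 the Mate 30 Pro had a 32MP front camera and a rear quad camera with dual 40MP wide-angle sensors, an 8MP telephoto, and ToF. In DxoMark's tests, among the ten best-scoring smartphones launched in 2019, all except the iPhone 11 Pro Max and the three Samsung Galaxy models have rear main cameras above 40MP, and front cameras are generally above 10MP. Xiaomi's CC9 Pro reaches 108MP on the rear main camera and 32MP on the front. Pixel upgrades remain an important direction for lens upgrades.

    Figure 3: front and rear pixel upgrade path of the Huawei Mate and P series
    Model | Launch | Rear | Front
    Huawei Mate | March 2013 | 8MP | 1MP
    Huawei P6 | June 2013 | 8MP | 5MP
    Huawei P7 | May 2014 | 13MP | 8MP
    Huawei Mate 7 | September 2014 | 13MP | 5MP
    Huawei P8 | April 2015 | 13MP | 8MP
    Huawei Mate 8 | November 2015 | 16MP | 8MP
    Huawei P9 | April 2016 | 12MP + 12MP | 8MP
    Huawei Mate 9 | November 2016 | 20MP + 12MP | 8MP
    Huawei P10 | February 2017 | 20MP + 12MP | 8MP
    Huawei Mate 10 | October 2017 | 20MP + 12MP | 8MP
    Huawei P20 | March 2018 | 20MP + 12MP | 24MP
    Huawei Mate 20 | October 2018 | 16MP + 12MP + 8MP | 24MP
    Huawei P30 | April 2019 | 40MP + 16MP + 8MP | 32MP
    Huawei P30 Pro | April 2019 | 40MP + 20MP + 8MP + ToF | 32MP
    Huawei Mate 30 | September 2019 | 40MP + 16MP + 8MP | 24MP
    Huawei Mate 30 Pro | September 2019 | 40MP + 40MP + 8MP + ToF | 32MP
    Sources: Huawei website, research institute

    In 2017, penetration of 13MP-and-above cameras in mid- to high-end phones exceeded 51%. According to Yole and 观研天下, in 2017 all phones priced above US$200 used lenses above 8MP; 13MP and above accounted for 51% of shipments and 8MP and above for 78%. By CMOS image sensor shipments, sensors of 5MP and below for handheld devices have declined year by year, and by 2018 more than half of handheld devices exceeded 13MP. As smartphone pixel counts keep rising, Yole expected shipments of 13MP-and-above handheld CMOS image sensors to rise further in 2019.

    [Figure 4: camera pixel distribution by phone price band (2017). Figure 5: handheld-device CMOS image sensor shipments by pixel count, 2013-2019E, in 100 million units. Sources: Yole, 观研天下, research institute]

    The share of 10MP-and-above lenses in lens makers' shipments keeps rising. According to Sunny Optical's interim reports, 10MP-and-above modules were 13% of its module shipments in 1H14, peaked at 78% in 1H18, and in 1H19 dipped slightly from the previous half but still rose year on year. According to Q Technology's monthly announcements, its module shipments have grown steadily since early 2018 apart from seasonal swings, and in 2019 the 10MP-and-above share of its module shipments rose 10 percentage points year on year to 54%.

    [Figure 6: Sunny Optical's 10MP-and-above share of module shipments reached 65% in 1H19 (module shipments in millions, 1H14-1H19). Figure 7: Q Technology's 10MP-and-above share rose 10pct to 54% in 2019 (module shipments in millions, January 2018-November 2019). Sources: Sunny Optical interim reports, Q Technology announcements, research institute]

    The image sensor is the key component of the camera module; pixel upgrades drive CMOS iteration

    In terms of imaging principle, a phone camera captures the scene through the lens and generates mobile charge on the image sensor. The image sensor converts the electrical signal into a digital signal, the DSP processes it, and the image appears on the screen. So besides the lens's ability to capture the scene, the image sensor is a key factor in image quality.

    [Figure 8: imaging principle of a phone camera. Sources: 手机资讯技术网, research institute]

    Qianzhan Industry Research Institute (前瞻产业研究院) estimates that in 2018 about 52% of the cost of a single camera came from the image sensor, 20% from the lens, 19% from module packaging, and only 6% and 3% from the voice coil motor and the infrared filter. Image sensors fall into two types: CCD (charge-coupled device) sensors and CMOS (complementary metal-oxide-semiconductor) image sensors (CIS). A CCD is a light-sensitive semiconductor chip for capturing images in which the charge of each pixel is passed in turn to the next pixel, leaves at the bottom, and is amplified at the sensor's edge before output. A CIS converts image information into a current or voltage signal by photoelectric conversion and reads it directly from the CMOS transistor switch array, without row-by-row transfer, so it is clearly more flexible and more highly integrated than a CCD.

    Sensor size is a key factor in imaging: the larger the sensor, the larger the light-sensitive area and the better the image. Although CCDs outperform CIS in sensitivity, resolution, and noise control, as CMOS processes advanced and phone pixel counts rose, the low power consumption and high integration of CIS allowed high pixel counts at controlled cost, making CIS the first choice for phone image sensors in the high-pixel era.

    [Figure 9: camera component breakdown. Figure 10: camera component cost structure (2018): image sensor 52%, lens 20%, module packaging 19%, voice coil motor 6%, infrared filter 3%. Sources: Ofweek 工控网, 前瞻产业研究院, research institute]

    Figure 11: CCD versus CMOS image sensor performance
    Item | CCD | CMOS
    Working principle | Charge signal is transferred first, then amplified, then A/D converted | Charge signal is amplified first, then A/D converted, then transferred
    Image quality | Good sensitivity, good resolution, low noise | Low sensitivity, noticeable noise (performs well at high sensitivity settings)
    Manufacturing process | Complex | Relatively simple, high yield
    Manufacturing cost | High | Low
    Power consumption | High (high drive voltage) | Low (highly integrated, small)
    Processing speed | Slow | Fast
    Sources: 智研咨询, research institute

    According to Yole, Sony alone held 50% of the global CIS market in 2018, with Samsung and OmniVision (acquired by Will Semiconductor) second and third at 21% and 12%. To match phone pixel upgrades, Sony, the global CIS leader, was first to launch a 48MP CIS in 2018, the IMX586, with a unit pixel of only 0.8 µm and a "Quad Bayer" color filter array in which four adjacent pixels share a color, so that in night mode the effective pixel becomes 1.6 µm, improving night shots. Samsung and OmniVision then launched their own 48MP CMOS image sensors, the GM1 and the OV48B.

    [Figure 12: global CMOS image sensor market shares in 2018 (Sony 50%, Samsung 21%, OmniVision, ON Semiconductor, SK Hynix, others). Figure 13: Sony's IMX586 CMOS image sensor, released in 2018. Sources: Yole, 天极网, research institute]

    Multi-element lenses are mainstream in the high-pixel era; 6P main-camera lenses reached 64.3% penetration in China in 2018

    During pixel upgrades, lens makers often choose multi-element lenses to improve imaging further, because adding elements strengthens the lens's ability to gather light, improving resolving power and contrast and reducing flare in dark scenes. More elements also enable functions such as large apertures and zoom. According to Huajing Industry Research Institute, 35.6% of smartphones shipped in China in 2018 had a 5P main lens, 64.3% a 6P main lens, and 0.1% a 7P main lens.

    [Figure 14: Sunny Optical six-element (6P) lens. Figure 15: main-camera lens element counts in China's 2018 smartphone shipments: 5P 35.6%, 6P 64.3%, 7P 0.1%. Sources: Sunny Optical website, 华经产业研究院, research institute]

    More elements mean more light loss, a bulkier lens, and higher demands on optical design. The Xiaomi CC9 Pro, released on November 5, 2019, uses five rear cameras; its main camera uses a 7P lens (8P in the premium edition) to deliver 100 million pixels, a very large 1/1.33-inch sensor, and an f/1.7 aperture. More elements directly increase lens volume. Based on 驱动中国's comparison of lens sizes at different pixel counts, we estimate the vertical projected area of a 108MP lens at about 2.9 cm², far above about 0.7 cm² for a 13MP lens. Although pixel upgrades still need more elements to improve imaging, lens makers and phone brands must weigh the gain in pixels against the light loss, design difficulty, and bulk that come with more elements.

    [Figure 16: a 100-million-pixel lens is much larger than a 13MP lens. Sources: 驱动中国, research institute]

    Glass-plastic hybrid lenses address the performance bottleneck, but mass production is hard and they are not yet widespread. Lens elements are commonly glass or plastic. Glass has a higher refractive index and better light transmission than plastic, but weight, yield, and cost make glass lenses hard to use widely in phones. Common phone lenses are therefore multi-element plastic lenses, and "6P lens" usually means a six-element plastic lens. In 2017 Sunny Optical put the world's first glass-plastic hybrid lens into mass production. Hybrid lenses reduce the light loss and distortion of multi-element plastic lenses, but for now their production cost and mass-production difficulty are higher than those of plastic elements, so their use in smartphones is limited.

    Figure 17: lens element materials compared
    Property | Plastic | Glass | Glass-plastic hybrid
    Process difficulty | Low | High | Intermediate
    Mass-production difficulty | High | Low | Intermediate
    Production cost | Low | High | Intermediate
    Thermal expansion coefficient | High | Low | Intermediate
    Weight | Light | Heavy | Intermediate
    Light transmittance | 89%-92% | [value missing in the extracted text] | Between the two
    Main downstream applications | Phones | High-end security, surveillance, automotive | Phones, high-end security, surveillance, automotive
    Representative companies | Largan (大立光), GSEO (玉晶光), Sunny Optical | Tamron (腾龙), Fujinon (富士能), Fujian Forecam (福建福光) | Sunny Optical, Phenix Optical (凤凰光学)
    Source: research institute

    Lens upgrades diversify in the multi-camera era, making it possible for phones to replace DSLRs

    Large apertures, wide angle, and zoom are on the rise, demanding more of lens makers' design capability

    Samsung released the SCH-B710, the world's first phone with rear dual cameras, in 2007, but the dual-camera era only really began in 2016 with the Huawei P9 and its Leica dual-camera module. The Huawei P20 Pro, the world's first phone with rear triple cameras, pushed smartphones into the multi-camera era in 2018. As rear cameras multiply, phone photography has expanded from high resolution to large apertures, telephoto, and wide angle, making it possible for phones to replace DSLRs. But because large-aperture, wide-angle, and telephoto lenses are prone to distortion from refraction, they also challenge lens makers' optical design, tuning, and assembly capability.

    The aperture is the device with which a lens controls how much light reaches the sensor. For a fixed sensor size and focal length, the smaller the f-number (focal length divided by the clear aperture diameter), the larger the aperture and the more light the lens admits. A large aperture produces background blur and allows faster shutter speeds, which reduces shake when capturing motion. To bring phone photography closer to a DSLR, large apertures have become standard on mainstream flagships. The Honor 20 Pro, launched in June 2019, has a main-camera aperture of f/1.4, the largest of any phone at the time. However, a larger aperture increases chromatic aberration and dispersion as light is refracted, so optical design (aberration correction) and assembly and tuning (keeping elements precisely coaxial) both become harder.

    [Figure 18: the smaller the f-number, the more light admitted and the better the image. Sources: 华强电子网, research institute]

    A wide-angle lens achieves a larger field of view with a shorter focal length. Some mainstream flagships now use wide-angle lenses (focal length 24-35mm, field of view 60-84 degrees) and ultra-wide lenses (focal length 14-20mm, field of view 94-118 degrees). The design difficulty is that refraction distorts the edges of the frame, so wide-angle lenses need finer combinations of elements, high-quality optical glass, and post-processing algorithms.

    [Figure 19: image at standard focal length. Figure 20: a wide-angle lens distorts both sides of the frame. Source: 中关村在线, research institute]

    A telephoto lens is one with a focal length of about 85mm or longer. Its field of view is narrow, and it is used for distant subjects. Digital zoom merely enlarges the pixels in a fixed region to show a distant scene, whereas a telephoto lens renders distant scenes more faithfully without losing image quality. The Huawei Mate 20 Pro, for example, uses a Leica triple camera with a 40MP wide-angle lens (27mm focal length), a 20MP ultra-wide lens (16mm focal length), and an 8MP telephoto lens, and its zoom modes include 5x hybrid zoom and 10x digital zoom.

    Periscope lenses resolve the conflict between high zoom and body thickness; the Huawei P30 Pro and OPPO Reno already use them

    As smartphones get thinner, the extra module thickness that comes with higher zoom makes high-zoom modules hard to fit in a phone. A periscope camera meets the zoom requirement while laying the lens module parallel to the body, avoiding a thicker phone. In February 2017 OPPO announced its own 5x lossless zoom technology based on a built-in optical prism. The miniature prism is the key part that lets a phone achieve high optical zoom. Huawei's flagship P30 Pro already has a periscope camera, and in April 2019 OPPO released the Reno series with 10x hybrid optical zoom (48MP main + 8MP ultra-wide + 13MP periscope telephoto).

    [Figure 21: OPPO's periscope camera achieves lossless zoom through a built-in miniature prism. Sources: OPPO website, research institute]

    AI algorithms join in: "clear even against the light" and "light up your beauty" make up for hardware limits

    In smartphone optical upgrades, besides better and more numerous optical components, post-capture image processing has become a new direction for phone makers. With the Huawei Mate 10, which carried the first smartphone processor with an NPU dedicated to AI computation, and the iPhone 8/8 Plus/X, whose A11 chip introduced a neural engine, AI photography has been a new trend in smartphone photography since 2018. The Huawei P30 Pro, for example, applies AI to night shots, HDR backlit beautification, background blur, scene recognition, and intelligent stabilization.

    Figure 22: AI algorithm features on mainstream brands' best-selling models (application scenarios: low-light night shots, HDR backlit beautification, portrait segmentation and background blur, intelligent scene recognition, AIS intelligent stabilization)
    - Huawei P30 Pro: Super Night Mode; AI facial lighting; AI smart processing; AI Photography Master
    - Apple iPhone 11 Pro: intelligent night mode; face detection and brightening; A13 Bionic real-time processing
    - Samsung Galaxy Note 10: intelligent auto-adjusting aperture; dynamic tone mapping; video background blur; AI scene recognition
    - Google Pixel 4 XL: intelligent night sight; AI HDR processing; AI foreground/background separation; intelligent motion-blur reduction
    - VIVO NEX 3: Super Night Mode; selfie beautification; lens-combination algorithm optimization
    Sources: official websites of HUAWEI, Apple, Samsung, Google, VIVO, and Xiaomi, research institute

    The first problem AI algorithms address is the limitation of traditional smartphones in low light at night. On the iPhone 11/11 Pro, for example, once the phone recognizes a night scene it takes several photos in one shot. Camera software with built-in AI algorithms, supported by the A13 Bionic chip, then merges the sharp parts of the frames to correct shake, automatically adjusts overall contrast so that everything in the frame stays in color balance, fine-tunes colors toward natural appearance, and finally uses AI processing to remove noise and add detail, producing a clear night photo.

    Google's Pixel 2, launched in 2017, has a single camera, but with a dedicated image-processing coprocessor (IPU) and various sensors the camera can sense depth and adjust exposure time through AI algorithms, producing clear, natural night photos. According to 脚本之家, the Pixel 4 XL, launched on October 15, 2019, can photograph a clear starry sky and Milky Way directly with algorithmic support.

    In background blur, HDR, and face brightening in backlit shots, AI has also addressed the unnatural separation of subject and blurred background and the lack of detail in traditional multi-camera modules. On the Huawei P30 Pro, for example, with the new-generation NPU Kirin 990 5G chip and an AI segmentation algorithm, the rear multi-camera module improves background-blur detail and strengthens background-blur rendering for live video, while the front camera uses an AI HDR portrait segmentation algorithm to separate people from the scene, so that in backlit conditions the subject, especially the face, stays bright and natural and the background stays clear and detailed.

    [Figure 23: iPhone 11 Pro night mode effect. Figure 24: HUAWEI P30 Pro AI background blur and backlit brightening. Sources: Apple iPhone website, HUAWEI website, research institute]

    Rising multi-camera penetration drives growth across the optics chain, with Android growing faster

    Multi-camera module assembly is harder; technical strength and innovation decide the winners

    Before dual cameras, the technical barrier to packaging a single compact camera module (CCM) was low, so the popularity of camera phones drew many suppliers into CCM packaging. As CCMs moved to multiple cameras, the number of module makers capable of mass production shrank, because multi-camera modules demand more of module precision, assembly equipment, and technique. Module makers must consider how extra lenses affect module volume and how much harder they make imaging-system calibration, so assembly difficulty and equipment investment rise sharply. By ittbank's incomplete count, there are more than 28 single-camera module suppliers worldwide, 10 dual-camera module suppliers, and only 3 triple-camera module suppliers.

    Multi-camera upgrades and rising penetration bring sizeable new demand to the phone lens industry, but for module makers this is both an opportunity and a challenge. Because R&D is harder, module makers face heavy margin pressure early in multi-camera production while yields ramp up, and once production matures they face price pressure from downstream customers. Maintaining technical strength and innovation capability is therefore what decides competition among module makers.

    Figure 25: camera module suppliers
    - Single-camera modules: OFILM (欧菲光), Sunny Optical (舜宇光学), Q Technology (丘钛科技), LG, Samsung Electro-Mechanics, Sharp, Truly International (信利国际), Primax (致伸科技), Partron, Cowell, Cammsys, 美细耐斯, Sony, 光阵光电, STMicroelectronics, 合力泰, 三赢兴, 深圳四季春, 惠州桑莱士, 深圳金康光电, 深圳凯木金, 深圳成像通, 深圳博立信, 深圳科特通, 百辰光电, 广州大凌实业, 深圳亿利威, 康隆光电, and others
    - Dual-camera modules: LG, Sunny Optical, OFILM, Samsung Electro-Mechanics, Q Technology, Lite-On (光宝集团), Sharp, Primax, Truly International, 大凌
    - Triple-camera modules: OFILM, Sunny Optical, Lite-On
    Sources: ittbank, research institute

    Common image sensor packaging technologies are chip-scale packaging (CSP), chip-on-board (COB), and flip-chip (FC). CSP is mostly used for low-pixel sensors (below 5M) and can be assembled on an SMT line. COB and FC suit mid- to high-pixel sensors (above 5M); they achieve higher image quality and precision with a thinner module, but the production line costs more. To meet phone pixel upgrades, module suppliers to mainstream brands, such as Sunny Optical, OFILM, Q Technology, LG, Sharp, and Sony, all use COB/FC packaging.

    After the CIS chip is packaged, the module maker moves components according to equipment settings to assemble the image sensor with the motor, lens, circuit board, lens holder, and other parts. As pixel counts and lens counts rise, stacked tolerances between components grow, making it hard to keep the lens and sensor optical axes concentric and perpendicular, which causes dark corners and blur at the edges of the image. Active alignment (AA) equipment is therefore needed for active focusing. According to 立鼎产业研究院, an AA machine costs about RMB 2-3 million. First-tier module makers mostly use imported equipment, while domestic makers such as Sunny Optical are also developing their own. The high cost of AA equipment is a capital barrier that keeps small and mid-sized module makers out of multi-camera modules.

    Besides developing its own AA equipment, Sunny Optical has developed two new packaging technologies, MOB (molding on board) and MOC (molding on chip). MOB/MOC packaging can be used for large-aperture modules and shrinks the module further, which suits the narrow bezels of full-screen phones. It also improves the module's structural performance so that AA calibration is no longer needed. According to 旭日大数据, Sunny's MOB and MOC technologies reduce the module base area by 11.4% and 22.2% respectively compared with COB. According to its website, OFILM developed its own CMP miniaturized packaging process in June 2017 and began mass production in the third quarter of 2018.

    Figure 26: mainstream CCM packaging technologies compared
    Item | CSP (chip-scale packaging) | COB (chip-on-board) | FC (flip-chip)
    Packaging characteristics | Package size is essentially the same as the die size; covered with glass; divided into glue-filled, phosphor-film, and other types | Bare-die packaging; needs a dust-free environment; can integrate the sensor chip, ISP, and flexible board | The sensor is mounted face-down on the circuit board, then the lens is fitted on top
    Module thickness | Thick | Relatively thin | About 1mm thinner than COB
    Precision | Low | High | High
    Image quality | Relatively low | Relatively high | Relatively high
    Yield | Above 96% | About 96% | About 96%
    Production line cost | Relatively low, only an SMT line needed | Relatively high, about RMB 10 million | About 30%-50% higher than COB
    Users | Small and mid-sized module makers | Sunny Optical, OFILM, Q Technology, and others | OFILM, Sony, LG, Sharp, Cowell (高伟电子)
    Sources: 立鼎产业研究院, research institute

    Android share is rising and its multi-camera upgrades are fast; high-end multi-camera module makers in its supply chain hit record shipments

    As the first brand to release a phone with rear Leica dual cameras, Huawei has spread dual cameras across its range far faster than other vendors. According to 旭日大数据, Huawei's dual-camera penetration reached 52.7% in 2017, while Vivo, Apple, OPPO, and Xiaomi reached 41.9%, 35.0%, 22.6%, and 16.8%. As penetration rose further, data from the China Academy of Information and Communications Technology (中国信通院) show that rear dual-camera models were 64% of phones on sale in China in 2018 and front dual-camera penetration reached 7%. According to Qianzhan Industry Research Institute, phones worldwide carried an average of 2.84 cameras each in 2018.

    [Figure 27: Huawei's dual-camera penetration reached 52.7% in 2017 (dual-camera penetration by brand in 2017: Samsung, LG, Xiaomi, OPPO, Apple, Vivo, Huawei). Figure 28: the global average reached 2.84 cameras per phone in 2018 (average cameras per phone, 2014-2018)]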
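    The aperture relationship discussed in this excerpt (f-number = focal length divided by clear aperture diameter, with a smaller f-number admitting more light) can be sketched as below. This is an illustrative calculation, not part of the report: the function names are hypothetical, and the comparison uses the standard approximation that light per unit sensor area scales with 1 / (f-number squared), ignoring transmission losses in the lens elements.

```python
def f_number(focal_length_mm, aperture_diameter_mm):
    """f-number N = focal length / clear aperture diameter (both in mm)."""
    if focal_length_mm <= 0 or aperture_diameter_mm <= 0:
        raise ValueError("focal length and aperture diameter must be positive")
    return focal_length_mm / aperture_diameter_mm


def relative_light(n_a, n_b):
    """How many times more light per unit sensor area f/n_a admits than f/n_b.

    Uses the approximation that illuminance is proportional to 1 / N**2.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("f-numbers must be positive")
    return (n_b / n_a) ** 2


if __name__ == "__main__":
    # f/1.4 (quoted for the Honor 20 Pro) versus f/1.7 (quoted for the CC9 Pro).
    print(round(relative_light(1.4, 1.7), 2))
```

    Under that approximation an f/1.4 lens gathers roughly 1.5 times the light of an f/1.7 lens, which is the practical meaning of "larger aperture" in the text.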

    Views: 57 · Published: 2020-09-27 · 28 pages · Rating: 5 stars