



From "Only Watching the Road" to "Situational Awareness": A Detailed Look at the PG电竞信息 Team's Winning Solution in the ICCV Autonomous Driving Challenge

2025-11-18

In the recent ICCV 2025 Autonomous Grand Challenge, a leading international autonomous driving competition, the "SimpleVSF" (Simple VLM-Scoring Fusion) model submitted by the PG电竞信息 AI team took first place in the end-to-end driving track (NAVSIM v2 End-to-End Driving Challenge) with a score of 53.06.

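SimpleVSF tightly integrates conventional trajectory planning with the high-level cognitive abilities of a vision-language model (VLM). This lets it understand complex traffic situations and overcome a limitation of existing end-to-end driving models, which "only watch the road and do not think." Two key innovations make this possible. First, a VLM-enhanced scorer no longer relies solely on raw sensor data; it can understand deeper traffic intent and "common sense," and so selects safer, more reasonable driving plans. Second, a dual trajectory-fusion decision mechanism (a Weight Fusioner and a VLM Fusioner) further fuses the trajectories selected by multiple scorers, so that the final decision is not only numerically optimal but also semantically sound.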

This article draws on the technical report submitted by PG电竞信息, "SimpleVSF: VLM-Scoring Fusion for Trajectory Prediction of End-to-End Autonomous Driving," to explain the architecture, the optimizations, and the experimental results.

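Background and Challenges

Autonomous driving has advanced rapidly in recent years and is moving from the traditional modular pipeline toward a more efficient and more robust end-to-end paradigm. Traditional modular systems (perception, localization, planning, control) tend to accumulate errors across modules, and in complex scenes the layer-by-layer hand-off of information often leads to delayed or suboptimal decisions. End-to-end methods use a neural network to generate driving actions or trajectories directly from sensor input, which unifies and optimizes the information flow. Even so, getting a machine to make "smart" decisions in complex environments the way a human does remains a major technical challenge.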

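The NAVSIM framework addresses these problems with simulation-based metrics: it unrolls a simplified bird's-eye-view (BEV) abstraction of the scene and rolls out the driving trajectory over a short simulation horizon. To evaluate driving systems beyond the states observed during human data collection, the NAVSIM v2 challenge introduces reactive background traffic participants and realistic synthesized novel-view inputs, which give a better measure of a model's robustness and generalization.

Mainstream approaches to this task fall roughly into three categories. The first is Transformer-based autoregressive methods, which predict waypoints one at a time to produce a trajectory; a representative work is Transfuser [1]. The second is diffusion-based methods, which introduce various control constraints during denoising to produce a trajectory; a representative work is DiffusionDrive [2]. The third is scorer-based methods, which score and filter a predefined trajectory vocabulary to produce a trajectory; a representative work is GTRS [3].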

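Method

The PG电竞信息 AI team proposed the SimpleVSF framework. Its core innovation is to introduce a vision-language model (VLM) as a high-level cognitive engine and to design a dual fusion strategy that injects the VLM's semantic understanding into the whole process of trajectory scoring and selection.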

Figure: Overall architecture of SimpleVSF

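The SimpleVSF framework consists of three cooperating modules:

01 Foundation: Diffusion-Based Trajectory Candidate Generation

The first step of the framework is to efficiently generate a diverse, high-quality set of candidate trajectories.

Technical choice: a diffusion model (Diffusion-based Trajectory Generator).

Role: the diffusion model generates trajectories conditioned on the ego-vehicle state and a bird's-eye-view (BEV) representation of the environment. Its strength is that it captures the multimodality of the trajectory distribution and produces a set of kinematically feasible, mutually distinct anchors, which gives the subsequent fine-grained evaluation enough "alternatives" to choose from.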

02 Core: VLM-Enhanced Hybrid Scoring (VLM-Enhanced Scoring)

SimpleVSF uses a hybrid scoring strategy that bridges high-level semantics and low-level geometry. It works as follows:

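A. Semantic input: a fine-tuned VLM (Qwen2VL-2B [4]) serves as the semantic processor. The VLM receives three kinds of information:

■ Front-camera image: provides the visual details of the scene;

■ Ego-vehicle state: physical quantities such as current speed and acceleration;

■ High-level driving command: an abstract command supplied by the planning system, such as "turn left" or "go straight."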

B. Cognitive directive output: from these inputs the VLM produces cognitive directives. These are high-level, abstract concepts that resemble human reasoning, for example:

■ Longitudinal directives: "keep speed," "accelerate," "decelerate gently," "stop";

■ Lateral directives: "keep lane center," "nudge left," "sharp right turn."

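C. Learnable feature fusion: these abstract language directives (such as "stop") first pass through a learnable encoding layer (Cognitive Directives Encoder), which converts them into dense numerical features. This VLM feature is then concatenated with the ego-vehicle state and the conventional perception input, and together they form the input to the trajectory scorer's decoder. With this explicit fusion, the VLM's high-level semantic understanding is no longer an implicit property of the model; it takes part directly in computing the numerical cost of each trajectory.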
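The encode-then-concatenate step in C can be illustrated with a minimal sketch. It is not the team's implementation: the directive vocabularies are taken only from the examples in this article, and the embedding width, the randomly initialised tables that stand in for learnable weights, and the names `encode_directives` and `build_scorer_input` are all assumptions made for illustration.

```python
import random

# Directive vocabularies copied from the article's examples; the full label
# set used by the team is not given, so these are placeholders.
LONGITUDINAL = ["keep speed", "accelerate", "decelerate gently", "stop"]
LATERAL = ["keep lane center", "nudge left", "sharp right turn"]

EMBED_DIM = 4  # hypothetical embedding width


def make_embedding_table(vocab, dim, seed):
    """Randomly initialised lookup table standing in for learnable weights."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}


LON_TABLE = make_embedding_table(LONGITUDINAL, EMBED_DIM, seed=0)
LAT_TABLE = make_embedding_table(LATERAL, EMBED_DIM, seed=1)


def encode_directives(longitudinal, lateral):
    """Map the two discrete directives to one dense feature vector."""
    return LON_TABLE[longitudinal] + LAT_TABLE[lateral]


def build_scorer_input(directive_feat, ego_state, perception_feat):
    """Concatenate VLM, ego-state and perception features for the scorer."""
    return list(directive_feat) + list(ego_state) + list(perception_feat)


vlm_feat = encode_directives("stop", "keep lane center")
scorer_input = build_scorer_input(
    vlm_feat, ego_state=[3.2, -0.5], perception_feat=[0.1] * 6
)
print(len(scorer_input))  # 8 + 2 + 6 = 16
```

In a trained system the lookup tables would be parameters updated together with the scorer, so that each directive's vector comes to reflect how it should shift trajectory scores.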

03 Safeguard: Dual Trajectory Fusion Strategy (Trajectory Fusion)

To reach a robust, balanced final decision, SimpleVSF uses two fusion mechanisms to safeguard the quality of the output trajectory.

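A. Quantitative fusion: Weight Fusioner (WF)

Mechanism: this is the primary mechanism and is grounded in quantitative rigor. It aggregates the scores from multiple scorers and multiple models, including both VLM-enhanced scorers and conventional scorers.

Fusion process:

■ Metric aggregation: the scores of a single trajectory along different dimensions (such as collision risk, comfort, and efficiency) are first aggregated.

■ Model aggregation: a dynamic weighting scheme adjusts the weights of the aggregated scores from different models (such as several VLM-enhanced scorers) according to their importance in the current scene.

Role: in most ordinary scenes, this ensures that the final decision is the statistically most reliable choice, based on input from multiple sources.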
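The two-stage aggregation of the Weight Fusioner can be sketched as follows. This is an illustration under stated assumptions, not the team's code: the metric names, all numeric scores and weights, and the function names are hypothetical, and the model weights are simply passed in by the caller because the article does not describe how they are derived from the scene.

```python
def aggregate_metrics(metric_scores, metric_weights):
    """Stage 1: collapse the per-metric scores of one trajectory into one score."""
    total = sum(metric_weights.values())
    return sum(metric_weights[m] * metric_scores[m] for m in metric_weights) / total


def weight_fusion(per_model_scores, metric_weights, model_weights):
    """Stage 2: weight each model's aggregated score and pick the best trajectory.

    per_model_scores[model][i] is a dict of metric scores for trajectory i.
    Returns (index of the best trajectory, list of fused scores).
    """
    n_traj = len(next(iter(per_model_scores.values())))
    norm = sum(model_weights.values())
    fused = []
    for i in range(n_traj):
        score = sum(
            model_weights[name] * aggregate_metrics(scores[i], metric_weights)
            for name, scores in per_model_scores.items()
        ) / norm
        fused.append(score)
    best = max(range(n_traj), key=fused.__getitem__)
    return best, fused


# Hypothetical numbers for three candidate trajectories and two scorers.
metric_weights = {"collision": 0.5, "comfort": 0.2, "efficiency": 0.3}
per_model_scores = {
    "traditional": [
        {"collision": 0.9, "comfort": 0.8, "efficiency": 0.4},
        {"collision": 0.6, "comfort": 0.9, "efficiency": 0.9},
        {"collision": 1.0, "comfort": 0.5, "efficiency": 0.6},
    ],
    "vlm_enhanced": [
        {"collision": 0.8, "comfort": 0.7, "efficiency": 0.5},
        {"collision": 0.9, "comfort": 0.9, "efficiency": 0.8},
        {"collision": 0.7, "comfort": 0.6, "efficiency": 0.6},
    ],
}
model_weights = {"traditional": 0.4, "vlm_enhanced": 0.6}

best, fused = weight_fusion(per_model_scores, metric_weights, model_weights)
print(best)  # 1
```

With these made-up numbers the conventional scorer alone would prefer trajectory 2, while the fused score prefers trajectory 1, which shows how a second scorer can change the outcome.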

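B. Qualitative fusion: VLM Fusioner (VLMF)

Mechanism: a final semantic refinement that relies on the VLM's qualitative reasoning.

Fusion process:

■ Trajectory selection: the top-ranked trajectory is taken from each individual scorer.

■ LQR simulation and rendering: the selected trajectories are smoothed by an LQR simulator to ensure kinematic feasibility. They are then visualized and rendered onto the current front-camera image, producing a single image that contains the "candidate courses of action."

■ VLM selection: the image with the rendered trajectories, together with a text instruction, is submitted to a larger and more capable VLM (Qwen2.5VL-72B [5]), which is explicitly asked to choose the "most reasonable" trajectory qualitatively, based on the scene and the instruction.

Role: this gives the system a semantic checkpoint, so that the final decision is not only numerically optimal but also reasonable in terms of high-level cognition and common sense.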
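The control flow of the VLM Fusioner can be sketched as below. Everything specific here is assumed for illustration: the prompt wording, the `render` and `ask_vlm` callables (stand-ins for LQR smoothing plus image overlay and for the actual VLM call), the shortcut when all scorers agree, and the fallback when the reply cannot be parsed are not described in the article.

```python
def top_trajectory_per_scorer(scores_by_scorer):
    """Pick the index of the highest-scoring trajectory for each scorer."""
    return {
        name: max(range(len(scores)), key=scores.__getitem__)
        for name, scores in scores_by_scorer.items()
    }


def build_selection_prompt(candidate_ids, command):
    """Compose the text half of the VLM query (wording is hypothetical)."""
    labels = ", ".join(str(i) for i in candidate_ids)
    return (
        f"The image shows candidate trajectories labelled {labels}. "
        f"The driving command is '{command}'. "
        "Reply with the label of the most reasonable trajectory."
    )


def vlm_fusion(scores_by_scorer, command, render, ask_vlm):
    """Let a VLM arbitrate between the scorers' top picks.

    `render(candidates)` and `ask_vlm(image, prompt)` are caller-supplied stubs.
    """
    picks = top_trajectory_per_scorer(scores_by_scorer)
    candidates = sorted(set(picks.values()))
    if len(candidates) == 1:  # assumed shortcut: all scorers agree
        return candidates[0]
    image = render(candidates)
    answer = ask_vlm(image, build_selection_prompt(candidates, command))
    try:
        choice = int(answer.strip())
    except ValueError:
        return candidates[0]  # assumed fallback for an unparseable reply
    return choice if choice in candidates else candidates[0]


# Hypothetical scores from three scorers over three candidate trajectories.
scores = {"A": [0.2, 0.9, 0.4], "B": [0.1, 0.3, 0.8], "C": [0.5, 0.95, 0.1]}
chosen = vlm_fusion(
    scores,
    "turn left",
    render=lambda ids: f"<image with trajectories {ids}>",
    ask_vlm=lambda image, prompt: "2",  # fake VLM that always answers "2"
)
print(chosen)  # 2
```

Passing the renderer and the VLM in as functions keeps the selection logic testable without a camera image or a 72B-parameter model.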

Figure: Trajectory fusion process of the VLM Fusioner

Experimental Results

To verify the effectiveness of these optimizations, the PG电竞信息 AI team ran ablation experiments on the Navhard data subset, with Version A as the baseline. The results are summarized in the table below.

Table: Ablation results of SimpleVSF on the Navhard data subset under different settings

■ Effect of the feature extraction network: the PG电竞信息 AI team used three different backbones, V2-99 [6], EVA-ViT-L [7], and ViT-L [8], corresponding to Version A, Version B, and Version C respectively. The results show that the choice of backbone has a significant effect on performance, and ViT-L clearly outperforms the other two.

■ Effectiveness of the VLM-enhanced scorer: Version D and Version E integrate the VLM-enhanced scorer. Version D outperforms Version A, the conventional scorer with the same backbone, which demonstrates the value of semantic guidance. Version E on its own scores slightly lower than Version C, the conventional scorer with the same backbone, but the real advantage of VLM-enhanced scorers lies in their potential for fusion.

■ Performance of the trajectory fusion strategies: the fusion strategies produced the largest performance gains the team observed. WF B+C+D+E achieved an EPDMS of 47.18 on the Navhard dataset, and the team used the fused result of these four scorers for the final Private_test_hard split as well. VLMF A+B+C achieved an even higher EPDMS of 47.68, but because of the submission rules this fusion strategy was not used in the final leaderboard submission.

Table: Performance of SimpleVSF on the competition's Private_test_hard data subset

On the Private_test_hard split used for the final leaderboard, the SimpleVSF framework proposed by the PG电竞信息 AI team ranked first with an overall EPDMS of 53.06. In Stage I it scored 100 on TLC (traffic light compliance) and 99.29 on both DAC (drivable area compliance) and DDC (driving direction compliance), which shows the model's robustness and its adherence to key traffic rules. In both Stage I and Stage II, the team's NC (no at-fault collisions) score was among the best of all participating teams. Other methods may excel on individual metrics, but SimpleVSF achieves a well-rounded balance across them.

Conclusion

This article presented "SimpleVSF," the model that won first place in the end-to-end driving track. The SimpleVSF framework brings vision-language models out of pure text and image generation tasks and into the core decision loop of autonomous driving, moving from "perception-action" to "perception-cognition-action."

References:

1. Chitta, K.; Prakash, A.; Jaeger, B.; Yu, Z.; Renz, K.; Geiger, A., Transfuser: Imitation with transformer-based sensor fusion for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022, 45 (11), 12878-12895.

2. Liao, B.; Chen, S.; Yin, H.; Jiang, B.; Wang, C.; Yan, S.; Zhang, X.; Li, X.; Zhang, Y.; Zhang, Q. In Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving, Proceedings of the Computer Vision and Pattern Recognition Conference, 2025; pp 12037-12047.

3. Li, Z.; Yao, W.; Wang, Z.; Sun, X.; Chen, J.; Chang, N.; Shen, M.; Wu, Z.; Lan, S.; Alvarez, J. M., Generalized Trajectory Scoring for End-to-end Multimodal Planning. arXiv preprint arXiv:2506.06664 2025.

4. Wang, P.; Bai, S.; Tan, S.; Wang, S.; Fan, Z.; Bai, J.; Chen, K.; Liu, X.; Wang, J.; Ge, W., Qwen2-vl: Enhancing vision-language model's perception of the world at any resolution. arXiv preprint arXiv:2409.12191 2024.

5. Bai, S.; Chen, K.; Liu, X.; Wang, J.; Ge, W.; Song, S.; Dang, K.; Wang, P.; Wang, S.; Tang, J., Qwen2.5-vl technical report. arXiv preprint arXiv:2502.13923 2025.

6. Lee, Y.; Hwang, J.-w.; Lee, S.; Bae, Y.; Park, J. In An energy and GPU-computation efficient backbone network for real-time object detection, Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, 2019; pp 0-0.

7. Fang, Y.; Sun, Q.; Wang, X.; Huang, T.; Wang, X.; Cao, Y., Eva-02: A visual representation for neon genesis. Image and Vision Computing 2024, 149, 105171.

8. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S., An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 2020.
