yjyoon / YOLOv9

Key changes

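  1. For now, `torch.load` is called with `weights_only=False` so that training and validation work under PyTorch 2.6.
  2. Fixed various minor conflicts caused by other updated libraries (PIL in particular).
  3. Fixed a bug in the proto-mask processing (a value was hard-coded).
  4. Added tiny and small segmentation pre-trained models.
  5. If you need the weights, request them from 윤영준 (주임).

    | Criterion | tiny | small |
    |---|---|---|
    | 640x640 input | 13.2 GFLOPs | 30 GFLOPs |
    | SAMA-COCO dataset | 42.2 mAP | 47.3 mAP |
    | Epochs | 320 | 67 |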
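  • SAMA-COCO is a re-inspected and re-labeled version of the COCO dataset that corrects its annotation errors. The masks were refined, and in particular the crowd or densely packed objects that COCO lumps together as CROWDED were all split into individual instance labels. As a result, these weights recognize people especially well (60% or higher).
  • However, this also made the class imbalance worse. Resolving it requires additional data, transfer learning to a specific task, or changes to the data-augmentation pipeline so that under-represented classes are handled better.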
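Item 1 of the list above can be sketched as a small loader. This is a minimal sketch under stated assumptions, not the repository's actual code: `torch_load_kwargs` and `load_trusted_checkpoint` are hypothetical names, and it assumes the checkpoint comes from a source you trust, because `weights_only=False` lets `torch.load` unpickle arbitrary Python objects. The `weights_only` argument has existed since PyTorch 1.13, and its default became `True` in 2.6, which is why checkpoints that pickle whole model objects stop loading unless it is set explicitly.

```python
import re


def torch_load_kwargs(torch_version: str) -> dict:
    """Extra keyword arguments for torch.load when the checkpoint is trusted.

    Hypothetical helper: passes weights_only=False only on versions that
    accept the argument (PyTorch 1.13 and later).
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized torch version: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) >= (1, 13):
        return {"weights_only": False}
    return {}


def load_trusted_checkpoint(path, map_location="cpu"):
    """Load a full pickled checkpoint. Only use on files you trust."""
    import torch  # imported lazily so torch_load_kwargs works without torch

    extra = torch_load_kwargs(torch.__version__)
    return torch.load(path, map_location=map_location, **extra)
```

A stricter alternative, if the set of pickled classes is known, is to keep `weights_only=True` and allowlist those classes with `torch.serialization.add_safe_globals`.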
| from | n | params | module | arguments |
|---|---|---|---|---|
| -1 | 1 | 464 | models.common.Conv | [3, 16, 3, 2] |
| -1 | 1 | 4672 | models.common.Conv | [16, 32, 3, 2] |
| -1 | 1 | 7872 | models.common.ELAN1 | [32, 32, 32, 16] |
| -1 | 1 | 18560 | models.common.AConv | [32, 64] |
| -1 | 1 | 65216 | models.common.RepNCSPELAN4 | [64, 64, 64, 32, 3] |
| -1 | 1 | 55488 | models.common.AConv | [64, 96] |
| -1 | 1 | 145824 | models.common.RepNCSPELAN4 | [96, 96, 96, 48, 3] |
| -1 | 1 | 110848 | models.common.AConv | [96, 128] |
| -1 | 1 | 258432 | models.common.RepNCSPELAN4 | [128, 128, 128, 64, 3] |
| -1 | 1 | 41344 | models.common.SPPELAN | [128, 128, 64] |
| -1 | 1 | 0 | torch.nn.modules.upsampling.Upsample | [None, 2, 'nearest'] |
| [-1, 6] | 1 | 0 | models.common.Concat | [1] |
| -1 | 1 | 158112 | models.common.RepNCSPELAN4 | [224, 96, 96, 48, 3] |
| -1 | 1 | 0 | torch.nn.modules.upsampling.Upsample | [None, 2, 'nearest'] |
| [-1, 4] | 1 | 0 | models.common.Concat | [1] |
| -1 | 1 | 71360 | models.common.RepNCSPELAN4 | [160, 64, 64, 32, 3] |
| -1 | 1 | 27744 | models.common.AConv | [64, 48] |
| [-1, 12] | 1 | 0 | models.common.Concat | [1] |
| -1 | 1 | 150432 | models.common.RepNCSPELAN4 | [144, 96, 96, 48, 3] |
| -1 | 1 | 55424 | models.common.AConv | [96, 64] |
| [-1, 9] | 1 | 0 | models.common.Concat | [1] |
| -1 | 1 | 266624 | models.common.RepNCSPELAN4 | [192, 128, 128, 64, 3] |
| 15 | 1 | 45376 | models.common.RepNCSPELAN4 | [64, 64, 64, 32, 1] |
| -1 | 1 | 0 | torch.nn.modules.upsampling.Upsample | [None, 2, 'nearest'] |
| -1 | 1 | 55488 | models.common.Conv | [64, 96, 3, 1] |
| [15, 18, 21, 24] | 1 | 1055184 | models.yolo.DSegment | [80, 16, 128, [64, 96, 128, 96]] |

gelan-t-seg_v1 summary: 1015 layers, 2594464 parameters, 2594448 gradients, 13.8 GFLOPs
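
The `params` column can be sanity-checked by hand. The following is a minimal sketch, assuming that `models.common.Conv` is a bias-free k x k convolution followed by BatchNorm (one weight and one bias per output channel) and that `AConv` wraps one such 3x3 convolution. Both are assumptions about the module definitions, not something the table itself states, and `conv_bn_params` is a hypothetical helper.

```python
def conv_bn_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a bias-free k x k conv plus BatchNorm weight and bias."""
    return c_in * c_out * k * k + 2 * c_out


# Per-layer parameter counts copied from the table, in row order.
layer_params = [
    464, 4672, 7872, 18560, 65216, 55488, 145824, 110848, 258432, 41344,
    0, 0, 158112, 0, 0, 71360, 27744, 0, 150432, 55424,
    0, 266624, 45376, 0, 55488, 1055184,
]

# Conv [3, 16, 3, 2]: 3 * 16 * 3 * 3 + 2 * 16 = 464
first_conv = conv_bn_params(3, 16, 3)
# AConv [32, 64]: 32 * 64 * 3 * 3 + 2 * 64 = 18560 (under the assumption above)
first_aconv = conv_bn_params(32, 64, 3)
total = sum(layer_params)

print(first_conv, first_aconv, total)
```

Under these assumptions the formula reproduces the `Conv` and `AConv` rows, and the column sums to the 2594464 parameters reported in the summary line.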

yolov9-t (13 GFLOPs)

             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 79/79 00:38
               all       5000      58597      0.559        0.4      0.422      0.301      0.557       0.38        0.4      0.251
            person       5000      19918      0.713      0.532      0.609      0.393      0.702      0.499      0.565      0.309
           bicycle       5000        445      0.514      0.238      0.239      0.137      0.461      0.209      0.198     0.0841
               car       5000       2825      0.663      0.426      0.484      0.302      0.661      0.405      0.453      0.243
        motorcycle       5000        668      0.679      0.322      0.388      0.232      0.613      0.277      0.329      0.172
          airplane       5000        186       0.73      0.575      0.633      0.506      0.707      0.543      0.577      0.386
               bus       5000        447      0.626      0.456      0.499      0.414      0.649       0.45      0.489      0.374
             train       5000        370      0.679      0.451      0.496      0.386      0.671      0.432       0.48      0.366
             truck       5000        602      0.378      0.229      0.218      0.142        0.4      0.224      0.215       0.13
              boat       5000        473      0.535       0.29      0.333      0.197      0.512       0.26      0.295      0.145
     traffic light       5000        524      0.557      0.365      0.396      0.217      0.552      0.342      0.368      0.165
      fire hydrant       5000         97      0.695      0.722      0.747      0.576      0.699      0.711      0.729      0.557
         stop sign       5000         56      0.711      0.768      0.778      0.728      0.723      0.768      0.777      0.705
     parking meter       5000         53      0.636      0.491      0.534      0.417      0.656      0.491      0.534      0.416
             bench       5000        770      0.489       0.16      0.181      0.118      0.481      0.144      0.157     0.0765
              bird       5000        587      0.529        0.4      0.431       0.25      0.527      0.368      0.389      0.203
               cat       5000        242      0.724      0.678      0.717      0.607      0.738      0.674      0.725      0.595
               dog       5000        274      0.585      0.526       0.52       0.43      0.604      0.526      0.526      0.403
             horse       5000        526      0.527       0.39      0.404      0.272      0.539      0.373      0.389      0.212
             sheep       5000        462      0.555       0.57      0.564      0.373      0.569      0.558      0.548      0.302
               cow       5000        500      0.517      0.464      0.476      0.325        0.5      0.428      0.441       0.27
          elephant       5000        488      0.607      0.605      0.624      0.459      0.605      0.584      0.605      0.392
              bear       5000        158      0.711      0.348      0.395       0.33      0.729      0.342      0.393      0.314
             zebra       5000        428      0.757        0.6      0.681      0.491      0.755      0.577      0.651      0.406
           giraffe       5000        354      0.825      0.647      0.729      0.568      0.822       0.63      0.702      0.449
          backpack       5000        533      0.367      0.126      0.141     0.0707      0.386      0.122      0.139     0.0646
          umbrella       5000        613        0.6      0.354       0.38       0.25       0.64      0.356       0.39      0.248
           handbag       5000        729      0.415      0.126      0.139     0.0829      0.457      0.128      0.145     0.0723
               tie       5000        290      0.723      0.334      0.387      0.272      0.732      0.324      0.373      0.236
          suitcase       5000        409      0.536      0.286      0.324      0.212      0.554      0.273       0.32      0.188
           frisbee       5000        131       0.68      0.616      0.659      0.518      0.708      0.595      0.657      0.457
              skis       5000        459      0.549      0.266      0.303      0.153      0.548      0.244      0.267     0.0853
         snowboard       5000         74      0.442      0.311       0.33      0.241      0.466      0.297      0.312      0.177
       sports ball       5000        221      0.631      0.466      0.489      0.352      0.627      0.448      0.468      0.258
              kite       5000        257      0.513      0.611      0.564      0.387      0.501      0.564      0.499       0.26
      baseball bat       5000        254      0.527      0.259      0.269      0.184      0.535      0.256      0.262      0.142
    baseball glove       5000        146      0.657      0.445      0.469       0.31      0.679      0.438       0.48      0.282
        skateboard       5000        226      0.645      0.506      0.522      0.347       0.64      0.496      0.507      0.238
         surfboard       5000        409      0.624      0.377      0.411      0.273      0.632      0.364      0.392      0.223
     tennis racket       5000        339      0.656      0.445        0.5      0.359      0.645      0.422       0.45       0.28
            bottle       5000       1345      0.567      0.338      0.375      0.237      0.565      0.314      0.354      0.192
        wine glass       5000        449      0.666      0.265      0.341      0.219        0.7      0.267      0.331      0.181
               cup       5000        937      0.467      0.386      0.382      0.269      0.483      0.379      0.378      0.249
              fork       5000        271      0.443       0.25      0.265      0.195      0.389      0.199      0.198     0.0829
             knife       5000        315       0.39      0.157      0.157        0.1      0.363      0.133      0.126     0.0696
             spoon       5000        306      0.294      0.101        0.1     0.0595      0.292     0.0882     0.0741     0.0372
              bowl       5000        659        0.5      0.355      0.346      0.257      0.434      0.294      0.256      0.142
            banana       5000       1910      0.599      0.491      0.496      0.278       0.53       0.41      0.381      0.158
             apple       5000        773      0.404      0.445       0.32      0.207       0.41      0.426      0.308      0.157
          sandwich       5000        141      0.444      0.404      0.352      0.262      0.493      0.411      0.377      0.252
            orange       5000        550      0.516      0.503      0.505      0.396      0.529      0.482      0.484      0.345
          broccoli       5000        436      0.467      0.475      0.441      0.253      0.471      0.448      0.425       0.23
            carrot       5000       1532       0.56      0.319      0.342      0.177      0.511      0.269      0.282      0.121
           hot dog       5000        109      0.339      0.459       0.38      0.261      0.293      0.367      0.285      0.171
             pizza       5000        312       0.68      0.593      0.623      0.515      0.686      0.582      0.614      0.481
             donut       5000        499      0.532      0.495       0.51      0.389      0.525      0.471      0.493      0.331
              cake       5000        438      0.474      0.342      0.352      0.223      0.476      0.324      0.349       0.21
             chair       5000       3480       0.54      0.275      0.311      0.181      0.522      0.249      0.265      0.111
             couch       5000        523      0.471      0.314       0.32      0.241      0.451      0.283        0.3      0.195
      potted plant       5000        360      0.395      0.222      0.212        0.1      0.386      0.194      0.185     0.0692
               bed       5000        310      0.547      0.294      0.326      0.237      0.551      0.277      0.309      0.189
      dining table       5000       1615      0.446      0.285      0.283      0.191      0.449      0.264      0.258      0.136
            toilet       5000        226      0.664       0.58      0.603      0.499      0.693      0.584      0.601      0.466
                tv       5000        305       0.65       0.61      0.642      0.496      0.661      0.597      0.629      0.438
            laptop       5000        288       0.65      0.503      0.543      0.457       0.65      0.486      0.519        0.3
             mouse       5000        112      0.594      0.661      0.669      0.534      0.624      0.651      0.665      0.475
            remote       5000        279      0.494      0.208       0.24      0.149      0.486       0.19      0.222      0.121
          keyboard       5000        238       0.68      0.474      0.566      0.414      0.691      0.459      0.559      0.394
        cell phone       5000        327      0.518      0.339      0.335      0.243      0.497      0.312      0.318      0.218
         microwave       5000         58      0.585      0.535      0.578      0.483      0.592      0.534      0.589      0.447
              oven       5000        149      0.548      0.369      0.396       0.27      0.572      0.362      0.388      0.222
           toaster       5000         12      0.773      0.288      0.352      0.202      0.766      0.277      0.352      0.192
              sink       5000        252      0.556       0.44      0.473      0.345      0.574      0.444      0.472      0.308
      refrigerator       5000        157      0.609      0.459      0.498      0.392      0.623      0.452      0.495       0.36
              book       5000       1082      0.398      0.308      0.283      0.152       0.31      0.217      0.173     0.0692
             clock       5000        281      0.647      0.584      0.581      0.422      0.668      0.581      0.587      0.392
              vase       5000        642      0.512       0.24      0.269      0.167       0.53      0.234      0.257      0.147
          scissors       5000         43       0.39      0.209      0.209      0.188      0.372      0.186      0.187     0.0807
        teddy bear       5000        254      0.668      0.449      0.505      0.346      0.654      0.423      0.481      0.303
        hair drier       5000         12          0          0     0.0607     0.0286          0          0     0.0508     0.0167
        toothbrush       5000         67      0.517      0.224      0.262       0.15      0.491      0.164      0.227      0.104

yolov8-n (12.6 GFLOPs), ultralytics (AGPL-licensed; do not use in production)

             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 313/313 [00:22<00:00, 13.90
               all       5000      58597       0.57      0.386       0.41      0.295      0.579      0.368      0.396      0.253
            person       2771      19918      0.727      0.405      0.489      0.328      0.732      0.386      0.465      0.266
           bicycle        156        445      0.447      0.229      0.224      0.128      0.442      0.204      0.201     0.0767
               car        575       2825      0.607      0.401       0.44      0.276      0.617      0.377      0.417      0.222
        motorcycle        164        668      0.644      0.333      0.374       0.22      0.649      0.311      0.352      0.175
          airplane         98        186      0.697      0.602      0.659      0.527      0.724      0.608      0.651      0.431
               bus        190        447      0.704      0.445      0.482      0.378      0.712      0.437      0.474      0.356
             train        155        370      0.752      0.389      0.432       0.32      0.759      0.381      0.426      0.323
             truck        230        602      0.426      0.261      0.249       0.16      0.443      0.251      0.246      0.145
              boat        111        473      0.464      0.339      0.299      0.174      0.467      0.302      0.285      0.139
     traffic light        170        524      0.499      0.437      0.443      0.252      0.497      0.399      0.396      0.175
      fire hydrant         81         97      0.801      0.753      0.795      0.638      0.809      0.743      0.788        0.6
         stop sign         53         56      0.581      0.768      0.773      0.712      0.611      0.768      0.773      0.681
     parking meter         32         53      0.731      0.604      0.664      0.558      0.765      0.604      0.682      0.535
             bench        218        770      0.469      0.153      0.161      0.104       0.47      0.144      0.149     0.0741
              bird        132        587      0.643      0.354      0.418      0.265       0.65      0.327       0.39      0.209
               cat        185        242      0.723      0.702      0.734      0.592      0.741      0.698      0.734      0.601
               dog        184        274       0.66      0.567      0.603      0.497      0.678      0.558      0.597       0.47
             horse        130        526       0.61      0.312      0.346      0.225      0.611      0.308      0.332      0.176
             sheep         57        462       0.54      0.535      0.517      0.363      0.548      0.517        0.5       0.29
               cow         83        500      0.623      0.489      0.507      0.365      0.606      0.454      0.478        0.3
          elephant         90        488      0.796      0.506      0.556      0.416      0.815      0.507      0.559       0.37
              bear         53        158      0.759      0.379      0.402      0.341      0.749      0.361      0.385      0.324
             zebra         86        428      0.763      0.558      0.618      0.466      0.775      0.547      0.601      0.398
           giraffe        104        354      0.851      0.579      0.636      0.501      0.843      0.571      0.627      0.425
          backpack        261        533      0.393      0.144      0.144     0.0777      0.393      0.124      0.139     0.0712
          umbrella        175        613      0.601      0.367      0.392      0.246      0.639       0.37      0.406      0.256
           handbag        334        729      0.429      0.133      0.151     0.0872      0.497      0.137      0.158     0.0808
               tie        142        290      0.574      0.341      0.382      0.257      0.603      0.331      0.372      0.225
          suitcase        102        409      0.507      0.301      0.353       0.24      0.527      0.286      0.344      0.222
           frisbee         84        131      0.754      0.641      0.683      0.543      0.762      0.611      0.674      0.437
              skis        114        459      0.239      0.102     0.0929     0.0395      0.236     0.0915     0.0637     0.0167
         snowboard         45         74      0.507      0.351      0.358      0.259      0.478      0.311      0.329      0.188
       sports ball        152        221      0.578      0.516      0.522      0.367      0.568      0.475      0.463      0.228
              kite         80        257      0.466      0.646      0.546      0.369      0.461      0.599      0.503      0.254
      baseball bat        101        254      0.486       0.24      0.232      0.129      0.565      0.252      0.259       0.13
    baseball glove         97        146      0.583      0.507      0.513      0.337      0.572      0.466      0.493      0.292
        skateboard        127        226      0.628       0.54      0.541      0.369      0.596      0.509      0.533       0.26
         surfboard        147        409      0.521      0.328      0.318      0.196      0.589       0.34      0.342      0.166
     tennis racket        168        339      0.665      0.433       0.46      0.298      0.677      0.419       0.44      0.281
            bottle        414       1345        0.6       0.36      0.402      0.257      0.598      0.325      0.378      0.212
        wine glass        115        449      0.613      0.276      0.342      0.214      0.673      0.272      0.334      0.181
               cup        383        937      0.497      0.412      0.417      0.303      0.507      0.392      0.405      0.269
              fork        148        271      0.507      0.296      0.293      0.207        0.5      0.262      0.241      0.108
             knife        168        315      0.375      0.184      0.172      0.105      0.341      0.152      0.143     0.0791
             spoon        143        306      0.327      0.127      0.117     0.0738       0.32      0.111     0.0972     0.0495
              bowl        269        659      0.457      0.412      0.383      0.286      0.391      0.331      0.261      0.151
            banana         97       1910      0.374     0.0545      0.132     0.0697      0.333      0.044     0.0944     0.0436
             apple         65        773      0.462     0.0899       0.19      0.127       0.46     0.0828      0.176        0.1
          sandwich         75        141      0.375      0.496      0.389      0.289      0.413      0.496      0.408      0.287
            orange         79        550      0.601      0.299      0.414      0.326      0.628      0.291      0.405      0.295
          broccoli         61        436      0.542      0.346      0.384      0.225      0.561      0.326      0.374      0.203
            carrot         68       1532      0.557      0.122       0.21      0.119       0.59       0.11      0.191     0.0935
           hot dog         37        109      0.551      0.541       0.54      0.372      0.532       0.45       0.46      0.292
             pizza        142        312       0.67      0.606      0.645      0.525      0.695       0.59      0.641      0.499
             donut         50        499      0.622      0.419      0.481       0.38      0.632      0.395      0.469      0.337
              cake        123        438      0.575      0.352      0.411      0.268      0.596      0.336      0.412      0.264
             chair        601       3480      0.564      0.216       0.27      0.169       0.54      0.187      0.235      0.102
             couch        203        523      0.447      0.251      0.247      0.177       0.47      0.241      0.245      0.145
      potted plant        163        360      0.364      0.289      0.227     0.0971      0.354       0.25      0.191     0.0681
               bed        136        310      0.449      0.274      0.272      0.181      0.476      0.269      0.268      0.155
      dining table        411       1615      0.261      0.115     0.0966     0.0526      0.189     0.0774     0.0598     0.0244
            toilet        143        226      0.631      0.588      0.632      0.542      0.664      0.597      0.631      0.524
                tv        209        305      0.624      0.636      0.653      0.491      0.633       0.61      0.631      0.435
            laptop        181        288      0.599      0.535      0.558      0.464      0.629      0.521      0.548      0.324
             mouse         87        112      0.615      0.686      0.687      0.522      0.619       0.67      0.673      0.467
            remote        122        279      0.368      0.247       0.23      0.146       0.39      0.232      0.226      0.119
          keyboard        149        238      0.696      0.501      0.556      0.406      0.712      0.475      0.549      0.378
        cell phone        212        327      0.471      0.313      0.328      0.229      0.504      0.307      0.329      0.209
         microwave         55         58      0.512      0.569      0.611      0.507       0.54      0.552      0.611      0.465
              oven         97        149      0.558      0.499      0.504      0.345      0.592       0.49      0.499       0.31
           toaster         10         12          1      0.185      0.501      0.327          1      0.165      0.501      0.303
              sink        182        252      0.534      0.464      0.465      0.329      0.572      0.462      0.488      0.315
      refrigerator         95        157      0.646      0.523      0.548      0.433      0.658      0.503      0.533      0.396
              book        217       1082      0.413      0.166      0.201      0.106      0.337      0.107      0.123     0.0489
             clock        207        281      0.622      0.598      0.612      0.436      0.628       0.58      0.612      0.386
              vase        230        642      0.567       0.22      0.263      0.174      0.586      0.212      0.255       0.15
          scissors         29         43      0.425      0.256      0.252      0.201      0.466      0.256      0.252      0.129
        teddy bear         96        254      0.638      0.437      0.489      0.368      0.672       0.42      0.473       0.33
        hair drier          6         12          1          0     0.0339    0.00601          1          0     0.0362     0.0106
        toothbrush         32         67      0.429      0.209      0.161     0.0991      0.448      0.206      0.167     0.0828

yolov11-n (10.2 GFLOPs) (AGPL-licensed; do not use in production)

             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 313/313 [00:23<00:00, 13.09
               all       5000      58597      0.594      0.392      0.424      0.309      0.601      0.376       0.41      0.265
            person       2771      19918       0.74      0.392      0.487       0.33      0.749       0.38      0.468       0.27
           bicycle        156        445       0.51      0.248      0.235      0.138      0.502      0.225       0.21     0.0858
               car        575       2825      0.636      0.401      0.451      0.284      0.643      0.382      0.428      0.227
        motorcycle        164        668      0.666      0.319      0.387      0.235      0.687      0.313      0.364       0.19
          airplane         98        186      0.762      0.613       0.69      0.557      0.784      0.608      0.679      0.439
               bus        190        447      0.727      0.423      0.478      0.381      0.758      0.425      0.475       0.36
             train        155        370      0.734      0.395      0.454      0.347      0.716      0.378      0.444       0.34
             truck        230        602      0.468      0.252       0.27      0.175      0.494      0.246      0.267      0.158
              boat        111        473      0.534      0.342      0.369       0.22      0.539      0.313      0.345      0.163
     traffic light        170        524      0.539      0.418      0.462      0.263      0.542      0.387      0.419      0.186
      fire hydrant         81         97      0.819      0.773      0.837      0.665      0.841      0.763      0.826      0.642
         stop sign         53         56      0.637       0.75      0.761      0.709      0.642       0.75      0.761      0.674
     parking meter         32         53      0.745       0.66      0.659      0.547      0.774       0.66       0.68      0.558
             bench        218        770      0.459      0.143      0.167      0.112      0.458      0.134      0.159     0.0792
              bird        132        587      0.647      0.356      0.443      0.288      0.661      0.334      0.415       0.23
               cat        185        242      0.755      0.707       0.75      0.635      0.775      0.707      0.756      0.632
               dog        184        274      0.744      0.602      0.645      0.553      0.756      0.602      0.647      0.524
             horse        130        526      0.668      0.338       0.37      0.237      0.664      0.331      0.351      0.187
             sheep         57        462      0.574      0.539      0.544      0.383      0.577      0.517      0.515      0.312
               cow         83        500      0.681      0.478      0.531      0.387      0.673      0.454      0.496      0.312
          elephant         90        488      0.802      0.512      0.582      0.441      0.815      0.514      0.582      0.395
              bear         53        158      0.812      0.367      0.415      0.357      0.829      0.361      0.418      0.335
             zebra         86        428      0.796      0.547      0.629      0.471      0.787      0.528      0.608      0.404
           giraffe        104        354      0.845      0.588      0.637      0.515      0.841      0.576      0.626      0.444
          backpack        261        533      0.442      0.154      0.149     0.0863      0.471      0.145      0.155      0.078
          umbrella        175        613      0.608      0.372      0.414      0.261      0.653      0.375      0.416      0.267
           handbag        334        729      0.458      0.125      0.148     0.0876      0.481      0.115      0.153     0.0774
               tie        142        290      0.606       0.36      0.379      0.256       0.61      0.328      0.354      0.221
          suitcase        102        409      0.553       0.34      0.398       0.26      0.575       0.33      0.385      0.242
           frisbee         84        131      0.733      0.664      0.701      0.559      0.745      0.672      0.707      0.468
              skis        114        459      0.261     0.0959     0.0949     0.0406      0.224     0.0741     0.0667     0.0175
         snowboard         45         74      0.515      0.405      0.408      0.295      0.545      0.378      0.401       0.21
       sports ball        152        221      0.678      0.507      0.524       0.38      0.669      0.489      0.496      0.256
              kite         80        257      0.488      0.665      0.585      0.415      0.474      0.607      0.538      0.266
      baseball bat        101        254      0.533      0.232      0.256      0.153      0.613      0.252      0.282      0.141
    baseball glove         97        146      0.609      0.541       0.55      0.351      0.622      0.541      0.555      0.325
        skateboard        127        226      0.653      0.549      0.557      0.384      0.645      0.513      0.544      0.261
         surfboard        147        409      0.523      0.306      0.316      0.198      0.566      0.311      0.346      0.168
     tennis racket        168        339      0.698      0.435      0.477      0.298      0.686      0.407      0.441      0.279
            bottle        414       1345      0.622      0.364      0.405      0.264       0.62      0.341      0.378      0.217
        wine glass        115        449      0.616      0.285       0.35      0.223      0.651      0.283      0.346       0.19
               cup        383        937        0.5      0.414       0.41        0.3      0.518      0.403      0.404      0.266
              fork        148        271      0.511       0.28      0.316      0.229      0.485      0.247      0.259      0.125
             knife        168        315      0.399      0.203      0.188      0.113      0.352       0.16      0.139      0.073
             spoon        143        306      0.332      0.154      0.136      0.083      0.313      0.134      0.115     0.0526
              bowl        269        659      0.467       0.41      0.372      0.276      0.388      0.319      0.249      0.143
            banana         97       1910      0.373     0.0508       0.15     0.0808      0.322     0.0393      0.113     0.0522
             apple         65        773      0.488       0.11      0.194      0.136       0.52      0.106      0.185      0.113
          sandwich         75        141      0.427      0.525      0.434      0.325      0.458      0.525      0.456      0.327
            orange         79        550      0.643      0.304      0.385      0.302      0.653      0.291       0.38      0.276
          broccoli         61        436       0.57      0.351      0.398      0.233      0.567      0.321      0.381      0.218
            carrot         68       1532      0.589      0.127      0.223      0.128      0.606       0.12      0.211      0.104
           hot dog         37        109      0.576      0.596      0.578      0.417      0.501      0.486       0.48      0.325
             pizza        142        312       0.67      0.631      0.661      0.543      0.682      0.612      0.651       0.51
             donut         50        499      0.653      0.403      0.496      0.391      0.678      0.391      0.489      0.355
              cake        123        438      0.655      0.345      0.458        0.3      0.677      0.333      0.455      0.297
             chair        601       3480      0.568      0.214      0.274      0.175      0.552      0.189      0.239      0.109
             couch        203        523      0.467       0.25      0.266      0.191      0.494      0.247      0.257       0.16
      potted plant        163        360      0.367      0.286      0.237      0.107      0.345      0.247        0.2     0.0761
               bed        136        310      0.551      0.294      0.313      0.214      0.554      0.284      0.302      0.186
      dining table        411       1615      0.292       0.12      0.101      0.054      0.219     0.0842     0.0594     0.0231
            toilet        143        226      0.714      0.615      0.669      0.572      0.713      0.597       0.66      0.545
                tv        209        305      0.696      0.639      0.682      0.517      0.703      0.623      0.659      0.461
            laptop        181        288      0.671      0.583      0.588      0.509      0.681      0.566      0.571      0.349
             mouse         87        112      0.617      0.652      0.684      0.537       0.63      0.643      0.678      0.492
            remote        122        279      0.433      0.276      0.274      0.177      0.441      0.262      0.266      0.147
          keyboard        149        238       0.69      0.496      0.581      0.421      0.681      0.471      0.563      0.393
        cell phone        212        327      0.567      0.349      0.375      0.267      0.574      0.324      0.369      0.237
         microwave         55         58      0.611      0.603       0.64      0.547      0.643      0.586      0.642      0.479
              oven         97        149      0.614      0.483      0.502      0.365      0.637       0.47      0.508      0.324
           toaster         10         12       0.28     0.0833      0.222      0.137      0.313     0.0833      0.222      0.151
              sink        182        252      0.544      0.472      0.486      0.349      0.566      0.456      0.486      0.321
      refrigerator         95        157       0.69      0.554      0.576      0.478      0.727      0.548      0.574      0.441
              book        217       1082      0.383      0.153      0.203      0.108      0.357      0.119      0.145     0.0594
             clock        207        281      0.692      0.601      0.619      0.453      0.712      0.594      0.619      0.406
              vase        230        642       0.57      0.221      0.269      0.181      0.579      0.208      0.264      0.156
          scissors         29         43        0.7      0.302      0.284      0.246      0.716      0.294      0.282      0.137
        teddy bear         96        254      0.726      0.461      0.537      0.404      0.718      0.445      0.518      0.374
        hair drier          6         12          1          0    0.00566    0.00199          1          0    0.00561    0.00224
        toothbrush         32         67      0.318      0.164      0.208      0.149      0.439      0.209      0.253       0.14

YOLOv9#

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Performance#

MS COCO

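| Model | Test Size | AP (val) | AP50 (val) | AP75 (val) | Param. | FLOPs |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv9-T | 640 | 38.3% | 53.1% | 41.3% | 2.0M | 7.7G |
| YOLOv9-S | 640 | 46.8% | 63.4% | 50.7% | 7.1M | 26.4G |
| YOLOv9-M | 640 | 51.4% | 68.1% | 56.1% | 20.0M | 76.3G |
| YOLOv9-C | 640 | 53.0% | 70.2% | 57.8% | 25.3M | 102.1G |
| YOLOv9-E | 640 | 55.6% | 72.8% | 60.6% | 57.3M | 189.0G |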

Custom training: https://github.com/WongKinYiu/yolov9/issues/30#issuecomment-1960955297

ONNX export: https://github.com/WongKinYiu/yolov9/issues/2#issuecomment-1960519506 https://github.com/WongKinYiu/yolov9/issues/40#issue-2150697688 https://github.com/WongKinYiu/yolov9/issues/130#issue-2162045461

ONNX export for segmentation: https://github.com/WongKinYiu/yolov9/issues/260#issue-2191162150

TensorRT inference: https://github.com/WongKinYiu/yolov9/issues/143#issuecomment-1975049660 https://github.com/WongKinYiu/yolov9/issues/34#issue-2150393690 https://github.com/WongKinYiu/yolov9/issues/79#issue-2153547004 https://github.com/WongKinYiu/yolov9/issues/143#issue-2164002309

QAT TensorRT: https://github.com/WongKinYiu/yolov9/issues/327#issue-2229284136 https://github.com/WongKinYiu/yolov9/issues/253#issue-2189520073

TensorRT inference for segmentation: https://github.com/WongKinYiu/yolov9/issues/446

TFLite: https://github.com/WongKinYiu/yolov9/issues/374#issuecomment-2065751706

OpenVINO: https://github.com/WongKinYiu/yolov9/issues/164#issue-2168540003

C# ONNX inference: https://github.com/WongKinYiu/yolov9/issues/95#issue-2155974619

C# OpenVINO inference: https://github.com/WongKinYiu/yolov9/issues/95#issuecomment-1968131244

OpenCV: https://github.com/WongKinYiu/yolov9/issues/113#issuecomment-1971327672

Hugging Face demo: https://github.com/WongKinYiu/yolov9/issues/45#issuecomment-1961496943

CoLab demo: https://github.com/WongKinYiu/yolov9/pull/18

ONNXSlim export: https://github.com/WongKinYiu/yolov9/pull/37

YOLOv9 ROS: https://github.com/WongKinYiu/yolov9/issues/144#issue-2164210644

YOLOv9 ROS TensorRT: https://github.com/WongKinYiu/yolov9/issues/145#issue-2164218595

YOLOv9 Julia: https://github.com/WongKinYiu/yolov9/issues/141#issuecomment-1973710107

YOLOv9 MLX: https://github.com/WongKinYiu/yolov9/issues/258#issue-2190586540

YOLOv9 StrongSORT with OSNet: https://github.com/WongKinYiu/yolov9/issues/299#issue-2212093340

YOLOv9 ByteTrack: https://github.com/WongKinYiu/yolov9/issues/78#issue-2153512879

YOLOv9 DeepSORT: https://github.com/WongKinYiu/yolov9/issues/98#issue-2156172319

YOLOv9 counting: https://github.com/WongKinYiu/yolov9/issues/84#issue-2153904804

YOLOv9 speed estimation: https://github.com/WongKinYiu/yolov9/issues/456

YOLOv9 face detection: https://github.com/WongKinYiu/yolov9/issues/121#issue-2160218766

YOLOv9 segmentation onnxruntime: https://github.com/WongKinYiu/yolov9/issues/151#issue-2165667350

Comet logging: https://github.com/WongKinYiu/yolov9/pull/110

MLflow logging: https://github.com/WongKinYiu/yolov9/pull/87

AnyLabeling tool: https://github.com/WongKinYiu/yolov9/issues/48#issue-2152139662

AX650N deploy: https://github.com/WongKinYiu/yolov9/issues/96#issue-2156115760

Conda environment: https://github.com/WongKinYiu/yolov9/pull/93

AutoDL docker environment: https://github.com/WongKinYiu/yolov9/issues/112#issue-2158203480

Installation#

Docker environment (recommended)

# create the docker container; you can increase the shared memory size if you have more memory available.
nvidia-docker run --name yolov9 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov9 --shm-size=64g nvcr.io/nvidia/pytorch:21.11-py3

# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx

# pip install required packages
pip install seaborn thop

# go to code folder
cd /yolov9

Evaluation#

yolov9-s-converted.pt yolov9-m-converted.pt yolov9-c-converted.pt yolov9-e-converted.pt
yolov9-s.pt yolov9-m.pt yolov9-c.pt yolov9-e.pt
gelan-s.pt gelan-m.pt gelan-c.pt gelan-e.pt

# evaluate converted yolov9 models
python val.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './yolov9-c-converted.pt' --save-json --name yolov9_c_c_640_val

# evaluate yolov9 models
# python val_dual.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './yolov9-c.pt' --save-json --name yolov9_c_640_val

# evaluate gelan models
# python val.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './gelan-c.pt' --save-json --name gelan_c_640_val

You will get the results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.530
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.702
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.578
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.362
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.585
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.693
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.392
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.652
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.702
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.541
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.760
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.844
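
If you want to compare runs programmatically, the summary above can be turned into a dictionary. The following is a minimal sketch, assuming the output keeps exactly the line layout shown above; `parse_coco_summary` is a hypothetical helper and is not part of this repository.

```python
import re

# Matches one summary line in the layout shown above (assumed to be stable).
_LINE = re.compile(
    r"Average (Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = (-?[\d.]+)"
)


def parse_coco_summary(text):
    """Return {(metric, iou, area, max_dets): value} for every summary line found."""
    results = {}
    for match in _LINE.finditer(text):
        _, metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results


if __name__ == "__main__":
    sample = (
        " Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.530\n"
        " Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.392\n"
    )
    for key, value in parse_coco_summary(sample).items():
        print(key, value)
```

The key `("AP", "0.50:0.95", "all", 100)` corresponds to the headline AP value reported in the Performance table.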

Training#

Data preparation

bash scripts/get_coco.sh
  • Download the MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend that you delete the train2017.cache and val2017.cache files and re-download the labels.
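
The cache cleanup recommended above can be scripted. This is a minimal sketch, assuming the cache files sit somewhere under your dataset root; `remove_stale_caches` and the `coco` path are hypothetical, so adjust them to your own layout.

```python
from pathlib import Path


def remove_stale_caches(dataset_root, names=("train2017.cache", "val2017.cache")):
    """Delete the named cache files anywhere under dataset_root; return the removed paths."""
    removed = []
    for name in names:
        # Collect matches first so files are not deleted while the tree is being walked.
        for path in list(Path(dataset_root).rglob(name)):
            if path.is_file():
                path.unlink()
                removed.append(path)
    return removed


if __name__ == "__main__":
    # "coco" is an assumed dataset root, not a path defined by this repository.
    for path in remove_stale_caches("coco"):
        print(f"removed {path}")
```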

Single GPU training

# train yolov9 models
python train_dual.py --workers 8 --device 0 --batch 16 --data data/coco.yaml --img 640 --cfg models/detect/yolov9-c.yaml --weights '' --name yolov9-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15

# train gelan models
# python train.py --workers 8 --device 0 --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15

Multiple GPU training

# train yolov9 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_dual.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch 128 --data data/coco.yaml --img 640 --cfg models/detect/yolov9-c.yaml --weights '' --name yolov9-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15

# train gelan models
# python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch 128 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15
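
The two multi-GPU commands above use the same `--batch 128` with different GPU counts. The sketch below assumes, as is common in YOLOv5-style training scripts, that `--batch` is the total batch size shared by all GPUs and must divide evenly among them; `per_gpu_batch` is a hypothetical helper for checking a configuration before launching, not a function from this repository.

```python
def per_gpu_batch(total_batch, devices):
    """Split a total batch size evenly across a comma-separated device list."""
    gpu_ids = [d for d in devices.split(",") if d.strip()]
    if not gpu_ids:
        raise ValueError("no devices given")
    if total_batch % len(gpu_ids) != 0:
        raise ValueError("total batch must be a multiple of the GPU count")
    return total_batch // len(gpu_ids)


if __name__ == "__main__":
    print(per_gpu_batch(128, "0,1,2,3,4,5,6,7"))  # 8 GPUs, as in the yolov9 command
    print(per_gpu_batch(128, "0,1,2,3"))          # 4 GPUs, as in the gelan command
```

Under that assumption, the 8-GPU command trains with 16 images per GPU and the 4-GPU command with 32 images per GPU.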

Re-parameterization#

See reparameterization.ipynb.

Inference#

# inference converted yolov9 models
python detect.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './yolov9-c-converted.pt' --name yolov9_c_c_640_detect

# inference yolov9 models
# python detect_dual.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './yolov9-c.pt' --name yolov9_c_640_detect

# inference gelan models
# python detect.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './gelan-c.pt' --name gelan_c_c_640_detect

Citation#

@article{wang2024yolov9,
  title={{YOLOv9}: Learning What You Want to Learn Using Programmable Gradient Information},
  author={Wang, Chien-Yao and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2402.13616},
  year={2024}
}
@article{chang2023yolor,
  title={{YOLOR}-Based Multi-Task Learning},
  author={Chang, Hung-Shuo and Wang, Chien-Yao and Wang, Richard Robert and Chou, Gene and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2309.16921},
  year={2023}
}

Teaser#

Parts of the code for YOLOR-Based Multi-Task Learning are released in this repository.

Object Detection#

gelan-c-det.pt

object detection

# coco/labels/{split}/*.txt
# bbox or polygon (1 instance 1 line)
python train.py --workers 8 --device 0 --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c-det --hyp hyp.scratch-high.yaml --min-items 0 --epochs 300 --close-mosaic 10

| Model | Test Size | Param. | FLOPs | AP (box) |
| --- | --- | --- | --- | --- |
| GELAN-C-DET | 640 | 25.3M | 102.1G | 52.3% |
| YOLOv9-C-DET | 640 | 25.3M | 102.1G | 53.0% |
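
The label layout noted in the comments above (one instance per line, either a bbox or a polygon) can be read with a few lines of Python. This is a minimal sketch, assuming the usual YOLO text format: a class index followed by either four normalized values (`x_center y_center width height`) or a flat list of normalized polygon vertices. `parse_label_line` is a hypothetical helper, not the repository's loader.

```python
def parse_label_line(line):
    """Parse one label line into (class_id, kind, values)."""
    parts = line.split()
    if not parts:
        raise ValueError("empty label line")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) == 4:
        # Assumed bbox layout: x_center, y_center, width, height (normalized).
        return class_id, "bbox", tuple(coords)
    if len(coords) >= 6 and len(coords) % 2 == 0:
        # Assumed polygon layout: x1 y1 x2 y2 ... with at least three vertices.
        points = list(zip(coords[0::2], coords[1::2]))
        return class_id, "polygon", points
    raise ValueError(f"unexpected number of values: {len(coords)}")


if __name__ == "__main__":
    print(parse_label_line("0 0.5 0.5 0.25 0.25"))
    print(parse_label_line("17 0.25 0.25 0.75 0.25 0.5 0.75"))
```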

Instance Segmentation#

gelan-c-seg.pt

object detection, instance segmentation

# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
python segment/train.py --workers 8 --device 0 --batch 32  --data coco.yaml --img 640 --cfg models/segment/gelan-c-seg.yaml --weights '' --name gelan-c-seg --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
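
| Model | Test Size | Param. | FLOPs | AP (box) | AP (mask) |
| --- | --- | --- | --- | --- | --- |
| GELAN-C-SEG | 640 | 27.4M | 144.6G | 52.3% | 42.4% |
| YOLOv9-C-SEG | 640 | 27.4M | 145.5G | 53.3% | 43.5% |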
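
The segmentation labels contain polygons only, yet the table reports both box and mask AP. One plausible way to obtain a box from a polygon is to take its axis-aligned extent, sketched below. `polygon_to_xywh` is a hypothetical helper written for illustration; the repository's own data loader may derive boxes differently.

```python
def polygon_to_xywh(points):
    """Return (x_center, y_center, width, height) of the box enclosing the points."""
    if not points:
        raise ValueError("polygon has no points")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)


if __name__ == "__main__":
    # A normalized rectangle given as four polygon vertices.
    square = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.5), (0.25, 0.5)]
    print(polygon_to_xywh(square))
```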

Panoptic Segmentation#

gelan-c-pan.pt

object detection, instance segmentation, semantic segmentation, stuff segmentation, panoptic segmentation

# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
# coco/stuff/{split}/*.txt
# polygon (1 semantic 1 line)
python panoptic/train.py --workers 8 --device 0 --batch 32  --data coco.yaml --img 640 --cfg models/panoptic/gelan-c-pan.yaml --weights '' --name gelan-c-pan --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
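
| Model | Test Size | Param. | FLOPs | AP (box) | AP (mask) | mIoU 164k/10k (semantic) | mIoU (stuff) | PQ (panoptic) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GELAN-C-PAN | 640 | 27.6M | 146.7G | 52.6% | 42.5% | 39.0%/48.3% | 52.7% | 39.4% |
| YOLOv9-C-PAN | 640 | 28.8M | 187.0G | 52.7% | 43.0% | 39.8%/- | 52.2% | 40.5% |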
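
For readers unfamiliar with the PQ column, the sketch below computes panoptic quality for a single class using the standard definition: the sum of IoUs over matched segments divided by |TP| + 0.5|FP| + 0.5|FN|. It is an illustration only and assumes matching has already been done; `panoptic_quality` is a hypothetical helper and is not the evaluation code used to produce the numbers above.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Panoptic quality for one class.

    matched_ious: IoU of each matched (prediction, ground truth) pair, i.e. the true positives.
    num_fp: number of unmatched predicted segments.
    num_fn: number of unmatched ground-truth segments.
    """
    num_tp = len(matched_ious)
    denominator = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        return 0.0  # no segments at all for this class
    return sum(matched_ious) / denominator


if __name__ == "__main__":
    # Two matched segments, one spurious prediction, one missed ground truth.
    print(panoptic_quality([0.75, 1.0], num_fp=1, num_fn=1))
```

A dataset-level PQ is usually obtained by averaging the per-class values.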

Image Captioning (not yet released)#

object detection, instance segmentation, semantic segmentation, stuff segmentation, panoptic segmentation, image captioning

# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
# coco/stuff/{split}/*.txt
# polygon (1 semantic 1 line)
# coco/annotations/*.json
# json (1 split 1 file)
python caption/train.py --workers 8 --device 0 --batch 32  --data coco.yaml --img 640 --cfg models/caption/gelan-c-cap.yaml --weights '' --name gelan-c-cap --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
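
| Model | Test Size | Param. | FLOPs | AP (box) | AP (mask) | mIoU 164k/10k (semantic) | mIoU (stuff) | PQ (panoptic) | BLEU@4 (caption) | CIDEr (caption) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GELAN-C-CAP | 640 | 47.5M | - | 51.9% | 42.6% | 42.5%/- | 56.5% | 41.7% | 38.8 | 122.3 |
| YOLOv9-C-CAP | 640 | 47.5M | - | 52.1% | 42.6% | 43.0%/- | 56.4% | 42.1% | 39.1 | 122.0 |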

Acknowledgements#
