Audio-Visual-Video-Caption: PyTorch implementation of an audio-visual fusion video captioning model (source code)

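Uploader: 42128270 | Uploaded: 2021-07-05 19:33:19 | File size: 99.52MB | File type: ZIP

Audio-Visual Video Caption

This is a video captioning project that I built with the PyTorch framework on the MSR-VTT dataset. It uses both visual and audio information.

The visual content of each video is preprocessed into a fixed number of frames, which are fed into a pretrained deep CNN (for example, ResNet 152) to extract features; these features are then fed into an LSTM encoder. The audio content is preprocessed into MFCCs and fed into a second LSTM encoder. The outputs and hidden states of the two LSTM encoders are combined by average pooling (or by multi-level attention and a child-sum unit) and then passed to an LSTM decoder, which generates the caption.

The basic structure of the project is adapted from another project (the reference is missing here).

To run the project you need the following dependencies: Python 3

Steps to run the model

The first step is to preprocess the videos and captions:

$ python preprocess.py --video_dir path/to/the/training/video/directory --output_dir path/to/the/features/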
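The description above mentions two steps that are easy to illustrate without any deep-learning library: reducing a video to a fixed number of frames, and combining the two encoders' states by average pooling. The following is a minimal sketch under stated assumptions, not the project's actual code: the function names `sample_frame_indices` and `mean_pool_fusion` are hypothetical, uniform sampling is only one plausible way to pick a fixed number of frames, and encoder states are represented as plain lists of floats rather than tensors.

```python
from typing import List, Sequence


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick num_samples frame indices spread evenly over a clip.

    Hypothetical helper: the clip is split into num_samples equal-width
    segments and the frame nearest each segment centre is chosen. Clips
    shorter than num_samples simply repeat some frames.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    step = total_frames / num_samples
    return [
        min(total_frames - 1, int(step * i + step / 2))
        for i in range(num_samples)
    ]


def mean_pool_fusion(visual: Sequence[float], audio: Sequence[float]) -> List[float]:
    """Fuse two equally sized state vectors by element-wise averaging."""
    if len(visual) != len(audio):
        raise ValueError("both state vectors must have the same size")
    return [(v + a) / 2.0 for v, a in zip(visual, audio)]


if __name__ == "__main__":
    # A 100-frame clip reduced to 4 frames.
    print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
    # Final hidden states of the visual and audio encoders (toy values).
    print(mean_pool_fusion([1.0, 2.0], [3.0, 6.0]))  # [2.0, 4.0]
```

Averaging requires both encoders to share the same hidden size, which is why the sketch rejects mismatched vectors. The attention-based and child-sum variants mentioned above replace this fixed average with learned combinations.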

Resource details

Audio-Visual-Video-Caption (71 files, 99.52MB)

Audio-Visual-Video-Caption-master/
  validate.py  4.32KB
  NLUtils.py  1.19KB
  dataloader.py  2.86KB
  eval.py  4.25KB
  opts.py  4.07KB
  models/
    MultimodalAtt.py  6.99KB
    __init__.py  40B
    ChildSum.py  933B
    Attention.py  994B
  pycocoevalcap/
    meteor/
      data/
        paraphrase-en.gz  58.95MB
      __init__.py  21B
      __pycache__/
        meteor.cpython-36.pyc  2.56KB
        __init__.cpython-36.pyc  184B
      hs_err_pid28996.log  104.31KB
      meteor.py  3.20KB
      meteor-1.5.jar  6.03MB
    __init__.py  21B
    eval.py  2.65KB
    tokenizer/
      __init__.py  21B
      ptbtokenizer.py  2.76KB
      __pycache__/
        __init__.cpython-36.pyc  187B
        ptbtokenizer.cpython-36.pyc  1.99KB
      tmpmnu9e6np  0B
      stanford-corenlp-3.4.1.jar  5.65MB
      tmp0sg2eg45  0B
    bleu/
      __init__.py  21B
      LICENSE  1.08KB
      bleu_scorer.py  8.49KB
      __pycache__/
        bleu.cpython-36.pyc  1.10KB
        bleu_scorer.cpython-36.pyc  7.68KB
        __init__.cpython-36.pyc  182B
      bleu.py  1.24KB
    __pycache__/
      __init__.cpython-36.pyc  177B
    rouge/
      __init__.py  23B
      rouge.py  3.58KB
      __pycache__/
        __init__.cpython-36.pyc  185B
        rouge.cpython-36.pyc  3.58KB
    spice/
      __init__.py  0B
      spice.py  2.79KB
      spice-1.0.jar  18.84MB
      lib/
        jackson-core-2.5.3.jar  224.61KB
        json-simple-1.1.1.jar  23.37KB
        slf4j-simple-1.7.21.jar  10.65KB
        objenesis-2.4.jar  50.08KB
        lmdbjni-win64-0.4.6.jar  70.97KB
        javassist-3.19.0-GA.jar  731.93KB
        Meteor-1.5.jar  6.03MB
        junit-4.12.jar  307.55KB
        ejml-0.23.jar  294.44KB
        slf4j-api-1.7.12.jar  31.37KB
        hamcrest-core-1.3.jar  43.97KB
        guava-19.0.jar  2.20MB
        lmdbjni-osx64-0.4.6.jar  103.81KB
        lmdbjni-0.4.6.jar  83.96KB
        fst-2.47.jar  371.96KB
        lmdbjni-linux64-0.4.6.jar  376.91KB
        SceneGraphParser-1.0.jar  160.19KB
    cider/
      __init__.py  21B
      cider.py  1.66KB
      __pycache__/
        cider.cpython-36.pyc  1.53KB
        __init__.cpython-36.pyc  183B
        cider_scorer.cpython-36.pyc  7.48KB
      cider_scorer.py  7.50KB
  preprocess_vocab.py  3.70KB
  cocoeval.py  6.89KB
  eval_single.py  4.44KB
  reprocess_audio.py  6.83KB
  README.md  4.80KB
  preprocess.py  7.58KB
  .vscode/
    settings.json  121B
  train.py  3.01KB
