Mean Gene Hacks: RNN Language Model outperforms GPT and other transformers!
A quick look at an interesting new Recurrent Neural Network (RNN) based language model that can compete with many transformer-based models. This attention-free model is not only innovative, it also delivers performance on par with, and sometimes even surpassing, that of similar-sized transformer-based models.
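To give a feel for why RWKV needs no attention matrix: each channel carries a small running state that decays over time, so a token's output mixes past values with exponentially decaying weights in O(1) memory per step. Below is a minimal single-channel NumPy sketch of that kind of decaying weighted-average recurrence (an illustration of the idea, not the project's optimized CUDA kernel; the names `w`, `u`, `k`, `v` follow the repo's terminology for decay, current-token bonus, keys, and values):

```python
import numpy as np

def wkv_recurrent(w, u, k, v):
    """Attention-free token mixing via a decaying running state.

    w: per-channel decay rate (positive scalar here)
    u: bonus weight applied to the current token
    k, v: 1-D arrays of per-token keys and values

    Single-channel sketch for illustration; the real model applies
    this per channel with learned w/u and extra gating layers.
    """
    a, b = 0.0, 0.0  # running numerator / denominator of the weighted average
    out = []
    for kt, vt in zip(k, v):
        # output = decayed history plus the current token, normalized
        out.append((a + np.exp(u + kt) * vt) / (b + np.exp(u + kt)))
        # fold the current token into the state, decaying the old history
        a = np.exp(-w) * a + np.exp(kt) * vt
        b = np.exp(-w) * b + np.exp(kt)
    return np.array(out)
```

Because the state is just two numbers per channel, generation cost per token stays constant regardless of context length, which is the practical appeal over quadratic attention.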
* GitHub repo for RWKV:
https://github.com/BlinkDL/RWKV-LM
* GitHub repo for the Gradio UI + 14B pretrained LM:
https://github.com/gururise/rwkv_gradio
* Online web demo of the smaller, faster instruct-trained 1.5B-parameter model:
https://huggingface.co/spaces/yahma/rwkv-instruct
* Online web demo of the instruct-trained 7B-parameter model:
https://huggingface.co/spaces/Hazzzardous/RWKV-Instruct
* Online web demo of the 14B-parameter model:
https://huggingface.co/spaces/yahma/rwkv-14b
* Prompt used for the response in the video:
Here is a short story (in the style of Arthur C. Clarke) in which a first-generation female artificial intelligence looks to make friends with a human:
AI-generated voice provided by 11.ai
AI-generated video provided by d-id.com
#ai #nlp #shorts #RNN #LanguageModels #gpt3