Nous Hermes 13B on Reddit

I like Nous-Hermes-Llama2-13B, but after a while it starts outputting sentences that lack prepositions. Sometimes even common, short verb conjugations go missing (am, are, etc.).

Just having a greeting message isn't enough to get it to copy the style; ideally your character card should include examples, and your own first message should also look like what you want to get back.

Not all 30B fine-tunes are necessarily better than every 13B fine-tune.

However, once the conversation with Nous Hermes gets past a few messages, it completely forgets things and responds as if it had no awareness of its previous content.

I'm finding it to be as good as or better than Vicuna/Wizard Vicuna/Wizard-uncensored models in almost every case. It tops most of the 13B models in most benchmarks I've seen it in (here's a compilation of LLM benchmarks by u/YearZero).

I have the same setup as you (64 GB RAM + 12 GB VRAM), but I'd rather stay with 13B, using the VRAM as much as possible.

Interesting results, thanks for sharing! I used qlora for 1.2, full fine-tune with 1.1 (for airoboros 7b and 13b). It seems perhaps the qlora claims of being within ~1% or so of a full fine-tune aren't quite proving out, or I've done something horribly wrong.

70B: Xwin-LM-70B-V0.1, Synthia-70B-v1.2b, Nous-Hermes-Llama2-70B. 13B: Mythalion-13B. But MXLewd-L2-20B is fascinating me a lot despite the technical issues I'm having with it.

It maybe helps its prose a little, but it gives the base model a serious downgrade in IQ that isn't worth the squeeze.

In my experience with the 13B editions of Hermes and Puffin, Puffin was basically useless to me and never generated good output, while Hermes-Llama2-13B is the best overall 13B model I have used.

This is pretty good.

My take with Nous Hermes 13b 4bit (I believe Mythomax will comply, too): User: Please write a novel chapter based on the given directions. Background: In a remote town, there is an old mine. One day, a couple of the mining crew were lost in the maze-like passageways and could not find the way out.

After testing so many models, I think "general intelligence" is a (or maybe "the") key to success: the smarter a model is, the less it seems to suffer from the repetition issue. That's why the 70Bs suffer less, and Nous Hermes v2, being one of the top-ranked 13B models, has worked well for me, too.

Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. Includes a llama.cpp based Space!

It doesn't get talked about very much in this subreddit, so I wanted to bring some more attention to Nous Hermes. For those of you who haven't tried it, do; it's worth it.

Seems like Llama 2 is better overall, but it really depends on what you want and what you can run.

I've got a feeling I wouldn't notice the censorship, so it's worth checking this one out I suppose.

Looking forward to seeing how the big brother does.

While testing it, I took notes and here's my verdict: "More storytelling than chatting, sometimes speech inside actions, not as smart as Nous Hermes Llama2, didn't follow …"

And many of these are 13B models that should work well with lower-VRAM GPUs! I recommend trying to load with ExLlama (HF if possible).

I'm running ooba Text Gen UI as the backend for the Nous-Hermes-13b 4bit GPTQ version, with the new exllama and exllama-hf; it's real fast on my local 3060.

Releasing Hermes-LLongMA-2 8k, a series of Llama-2 models trained at 8k context length using linear positional interpolation scaling. The models were trained in collaboration with Teknium1 and u/emozilla of NousResearch, and u/kaiokendev. We are planning on releasing different …

>Nous-Hermes-13B

max_seq_len = 4096, compress_pos_emb = 2. Using these settings, no OOM on load or during use, and the context size reaches up to ~3254, hovering around that value with max_new_tokens set to 800. VRAM usage sits around 11.7~11.8 GB with other apps running, such as Steam and 20 or so Chrome tabs with a Twitch stream in the background.
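A minimal sketch of what the compress_pos_emb = 2 setting quoted above and the "linear positional interpolation scaling" used for Hermes-LLongMA-2 8k actually do to the rotary position embeddings: positions are divided by the scaling factor so a longer window maps back into the position range the base model was pre-trained on. This assumes PyTorch; the function name and dimensions are illustrative, not taken from any particular loader.

```python
import torch

def rope_angles(positions: torch.Tensor, head_dim: int = 128,
                base: float = 10000.0, compress_pos_emb: float = 2.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies for one attention head.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # Linear position interpolation: dividing positions by the factor squeezes,
    # e.g., a 4096-token window into the 0..2047 range a LLaMA-1 13B saw in training.
    scaled = positions.float() / compress_pos_emb
    return torch.outer(scaled, inv_freq)  # shape: [seq_len, head_dim // 2]

# With compress_pos_emb = 2, token 4095 gets the angles token 2047.5 would have had
# without scaling, so the model never sees out-of-range positions.
angles = rope_angles(torch.arange(4096))
print(angles.shape)  # torch.Size([4096, 64])
```

The same factor-of-2 idea covers the LLongMA-2 case: a Llama-2 base with 4k native context stretched to 8k.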
Custom Dataset Enriched with Function Calling: our model's training data includes a unique feature, function calling. This distinctive addition transforms Nous-Hermes-2-Vision into a Vision-Language Action Model. Developers now have a versatile tool at their disposal, primed for crafting a myriad of ingenious automations.

Thanks to our most esteemed model trainer, Mr TheBloke, we now have versions of Manticore, Nous Hermes (!!), WizardLM and so on, all with the SuperHOT 8k context LoRA.

Nous Hermes L2 13B (4-bit) has me really surprised; I've been using it for many days and it's now my clear favorite.

I just tried doing a scene using nous-hermes-2-solar-10.7b and found that it quickly devolved into the bot endlessly repeating itself regardless of settings. If anyone figures out a way to get it to stop talking or acting on behalf of my character, that'd be a plus.

But sometimes I have trouble making those creative models (Nous-Hermes, Chronos, Airoboros) follow instructions; they end up speaking and acting as me.

I double-checked to make sure my context/instruct settings were right, textgen settings too, and yet despite everything being OK I could barely get a few posts into the roleplay before …

dolphin, airoboros and nous-hermes have no explicit censorship; airoboros is currently the best 70B Llama 2 model, as other ones are still in training.

Unfortunately, while this model does write quite well, it still only takes me about 20 or so messages before it starts showing the same "catch phrase" behavior as the dozen or so other LLaMA 2 models I've tried.

So I was excited to try the new Chronos Hermes.

GPT4-x-Vicuna-13b-4bit does not seem to have such a problem and its responses feel better.

I just uploaded the Puffin benchmarks and I can confirm Puffin beats Hermes-2 for the #1 spot in even popular single-turn benchmarks like Arc-E, Winogrande, and Hellaswag, and ties Hermes-2 in PIQA. It also reaches within 0.1% of Hermes-2's average score.

Solar Hermes is generally the worst mainstream Solar finetune I know of. It's very weird. I'm not even sure how they managed to make it that dumb. 7B Capybara was solid AF.

New unfiltered 13B: OpenAccess AI Collective's Wizard Mega 13B.

It's a merge between a custom model and the Nous Hermes 13b.

This is version 2 of Nous Research's line of Hermes models. Nous Hermes 2 builds on the Open Hermes 2.5 dataset, surpassing all Open Hermes and Nous Hermes models of the past, trained over Yi 34B with others to come! We achieve incredible benchmarks and surpass all of the previous Open Hermes and Nous Hermes models in all scores. The Hermes 2 model was trained on 900,000 instructions and surpasses all previous versions of Hermes 13B and below, matching 70B on some benchmarks! Hermes 2 changes the game with strong multiturn chat skills, system prompt capabilities, and the ChatML format.
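Since Hermes 2 uses the ChatML format mentioned above, here is a minimal sketch of what that format looks like on the wire. The helper function and the system/user text are made up for illustration; only the <|im_start|>/<|im_end|> framing is the actual ChatML convention.

```python
def chatml_prompt(system, turns):
    """Build a ChatML prompt string from a system message and (role, text) turns."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

print(chatml_prompt(
    "You are Hermes, a helpful assistant.",
    [("user", "Summarize why some 13B fine-tunes rival 30B ones.")],
))
```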
I went with the q5_K_M version of Nous Hermes 13b because I was curious if the lower perplexity would make a difference.

My top three are (note: my rig can only run 13B/7B):
- wizardLM-13B-1.0
- Nous-Hermes-13B
- Selfee-13B-GPTQ (this one is interesting: it will revise its own response, but it takes longer to arrive at a final answer; though most of the time, the first response is good enough)
My entire list is at: Local LLM Comparison Repo.

[Also: any plans for 30B? Would be great to see if 30B improves the results.]

It replaced my previous favorites, which were L1 Wizard Vicuna 13B and L2 Airoboros 13B.

Its quality, diversity, and scale are unmatched in the current OS LM landscape.

In LLaMA (1) times, I preferred Guanaco 33B over Chronos Hermes, but in Llama 2 land, my favorite is Nous Hermes Llama2.

This one seems to have much less of an issue with …

It feels like Chronos Hermes 13B is more of a novel writer than a chat writer.

Nous Hermes 13b is very good. I wasn't a big fan of Nous Hermes in the past: I found it had major issues with speaking for me, or screwing up by including meta-text or switching RP prose styles mid-conversation. I've been testing it against my current favorite (FlatOrcaMaid 13B) and it's performing very well.

Nous-Hermes-Llama2 (very smart and good storytelling), vicuna-13B-v1.5-16K (16K context instead of the usual 4K enables more complex character setups and much longer stories). 70B models would most likely be even better, but my system doesn't let me run them with acceptable speed for realtime chat, so the best for me are currently these 13Bs.

They aren't explicitly trained on NSFW content, so if you want that, it needs to be in the foundational model.

Having a 20B that's faster than the 70Bs and better than the 13Bs would be very welcome.

Nous is very hit or miss with their datasets at times.

Thanks for training/sharing this @NousResearch.

Some newer 13B models seem to be better than older 30Bs.

I occasionally use Nous-Hermes-13b or Manticore-13b-chat-pyg. I find the former to be quite decent, but sometimes I notice that it traps itself in a loop by repeating the same scene all over again, while the latter seems more prone to messing up details.

I've noticed that MythoMax-L2-13B needs more guidance to use actions/emotes than, e.g., …
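For anyone wanting to try the q5_K_M quant mentioned above locally, here is a rough sketch using llama-cpp-python. The file name, layer count, and prompt are placeholders, and the assumption is an Alpaca-style "### Instruction / ### Response" template, which is what the original Nous-Hermes 13B cards describe (the newer Hermes 2 models expect ChatML instead).

```python
from llama_cpp import Llama

# Hypothetical local path to a quantized Nous Hermes 13B file.
llm = Llama(
    model_path="./nous-hermes-llama2-13b.Q5_K_M.gguf",
    n_ctx=4096,        # matches the max_seq_len discussed earlier
    n_gpu_layers=35,   # offload as many layers as a 12 GB card allows
)

prompt = (
    "### Instruction:\n"
    "Write the opening paragraph of a chapter set in an abandoned mine.\n\n"
    "### Response:\n"
)
out = llm(prompt, max_tokens=300, temperature=0.8, stop=["### Instruction:"])
print(out["choices"][0]["text"])
```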