Nous Hermes 13B: a roundup of Reddit impressions, comparisons, and setup notes.

Yes, ExLlama is much faster, but the speed is OK with llama.cpp. The names that keep coming up are MythoMax-L2-13B, Mythalion, Nous-Hermes, Xwin, Mistral, Athena, and the Llama 2 base model, just off the top of my head; these are the results of my initial testing.

My favorite so far is Nous Hermes Llama 2 13B. At IQ3_S or Q3_K_S it can run on very modest hardware, and to try other quantization levels you can grab the other tags. I'm still curious what differences and optimizations Nous is actually doing on top of the base model, and whether there are any rankings or evals for writing style and creative writing; I always have to dig through a bunch of random posts to figure out what people are using. Nous Hermes was trained on roughly 300k GPT-4 instructions, and we definitely have more than a million GPT-3.5 prompts floating around.

For a quick comparison I ran a batch of models, all 13B at Q6_K with the contrastive search preset, against the same prompt every time: "Write me a 20 word poem about fire." Sometimes even common, short verb conjugations go missing (am, are, etc.). Everything Hermes failed at, ChatGPT failed just as much, which is what happens when you feed one of your neural networks Reddit and the other one textbooks.

To lay out the current landscape: for role-playing, MythoMax, Chronos-Hermes, or Kimiko. You could also try some of the 2x7B merges, such as Blue-Orchid-2x7b or DareBeagel-2x7b. Dolphin, Airoboros, and Nous-Hermes have no explicit censorship, and Airoboros is currently the best 70B Llama 2 model, as other ones are still in training. Until the 8K Hermes is released, I think this is the best it gets for an instant, no-fine-tuning chatbot; the 7B Capybara was solid as well. Just a few days ago I started my adventure with LLMs, but I really enjoy TheBloke's HornyEchidna-13B as well.

I installed Nous-Hermes-13B-GGML and WizardLM-30B-GGML following the instructions in this Reddit post, and llama.cpp loads the model from models\TheBloke_Nous-Hermes-Llama2-GGML\ without issue; it works on my laptop with 8 GB RAM. I'm using the nous-hermes-llama2-13b q5_K_M quant. It's less inclined to soppy, low-brow writing than Llama 2 chat or GPT turbo/sage, and it seems to mostly grok ERP scenarios. Nous Hermes 13B is very good: most of the time the first response is good enough, it's nicely descriptive, and the GGML 13B handles the long prompt of a complex character card (2,851 tokens with all example chats) in 4 out of 5 tries, which is better than Nous-Hermes-Llama2-GPTQ managed for me.
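To make the llama.cpp setup above concrete, here is a minimal sketch using the llama-cpp-python bindings. The file path, quantization level, and layer count are placeholders to adjust for your own download and hardware, not values taken from any particular comment.

```python
# Minimal sketch, assuming llama-cpp-python is installed and a local GGUF/GGML
# quant of Nous-Hermes-Llama2-13B exists at the (hypothetical) path below.
from llama_cpp import Llama

llm = Llama(
    model_path="models/nous-hermes-llama2-13b.Q5_K_M.gguf",  # placeholder path
    n_ctx=4096,       # Nous-Hermes-Llama2 is a 4K-context model
    n_gpu_layers=0,   # raise this to offload layers to the GPU; 0 = CPU only
    verbose=False,
)

out = llm(
    "Write me a 20 word poem about fire.",
    max_tokens=96,
    temperature=0.7,
)
print(out["choices"][0]["text"].strip())
```

On a CPU-only 8 GB laptop you would pick a smaller quant (Q3/Q4) and leave n_gpu_layers at 0; with a 12 GB card you can offload most or all of the layers.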
My current short list: WizardLM-Uncensored Llama 2 13B, Nous-Hermes Llama 1 13B (slept-on abilities with the right prompts), Wizardcoder-Guanaco-15B, and upstage's llama-2048 instruct (the strongest Llama 1 model; except for coding it is close to many 70B models). LLaMA2-13B-Tiefighter and MythoMax-L2-13B are good picks for when you need some VRAM left over for other stuff. I would start with Nous-Hermes-13B for uncensored use, and wizard-vicuna-13B or wizardLM-13B-1.0 for censored, general instruction-following. A few other good ones are kimiko-mistral-7b, Wizard-Vicuna-13B-Uncensored, nous-hermes-llama2-13b, and vicuna-13b; they're solid 13B-class models that still perform well and are really fast. Nous Hermes is my personal favorite and even pretty good at moral dilemmas and long-context programming conversations.

Has anyone run Nous-Hermes 13B on GPT4All before? I side-loaded it into GPT4All and love it to bits. VRAM usage sits around 11.7-11.8 GB even with other apps open, such as Steam, twenty or so Chrome tabs, and a Twitch stream in the background. Since the beginning of the year I've built multiple custom nodes for ComfyUI, translated scripts from PowerShell to Python, and started building a text parser for document and web-page analysis.

I have not touched a 13B in a very long time because 7B Mistral finetunes are typically better, and I only had the same kind of success with chronos-hermes-13B-GPTQ (64g). On the other hand, Vicuna 1.5 13B 16K is able to understand a 24 KB (8K-token) prompt file of corpus/FAQ material much more deeply than the 7B 8K release, and it is phenomenal at answering questions on the material you provide it. One quirk: when I tested Nous Hermes in Russian, it switched from Russian to English right in the middle of a word.

The Hermes 2 model was trained on 900,000 instructions and surpasses all previous versions of Hermes 13B and below, matching 70B models on some benchmarks. Hermes 2 changes the game with strong multi-turn chat skills and system-prompt capabilities, and it uses the ChatML format.

I have your same setup (64 GB RAM + 12 GB VRAM), but I'd rather stay with 13B and use the VRAM as much as possible: I double-checked my context/instruct settings and textgen settings, and despite everything being OK I could barely get a few posts into a roleplay before the usual repetition showed up. I've also made a playground with a bunch of the top 13B models (OpenOrca, Airoboros, Nous-Hermes, Vicuna, etc.) available to compare side by side. Using a 3060 (12 GB VRAM), I run Nous-Hermes-13B with max_seq_len = 4096.
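Since the Hermes 2 models mentioned above use ChatML, here is a rough sketch of building such a prompt by hand. The system message text is just an illustration, and whether you append the trailing assistant header yourself depends on your backend.

```python
# Sketch of a hand-built ChatML prompt for a Hermes 2 style model.
# The system message wording is illustrative, not taken from the model card.
def chatml_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """turns is a list of (role, content) pairs, e.g. ("user", "Hi")."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Leave an open assistant header so the model writes the next reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = chatml_prompt(
    "You are Hermes, a helpful assistant.",
    [("user", "Summarize the pros and cons of Q4 vs Q5 quantization in two sentences.")],
)
print(prompt)
```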
That strong system-prompt and instruction following is better for building commercial apps where the AI may need to handle sensitive data or do authorized tasks that you would expect from an employee behind a desk.

GPT4-x-Vicuna-13b, a Vicuna model built on GPT-4 outputs, provides a good balance between speed and instruction following. I use the GGML/GGUF builds for physics research, questions about society and philosophy, "slow" RAG inference, and translating between English and German.

I just uploaded the Puffin benchmarks, and I can confirm Puffin beats Hermes-2 for the #1 spot even in popular single-turn benchmarks like ARC-Easy, Winogrande, and HellaSwag, ties Hermes-2 on PIQA, and reaches within 0.1% of Hermes elsewhere. Puffin is released by Nous, the same org that released Hermes, and it's trained on thousands of real long-context conversations people had with GPT-4; that dataset's quality, diversity, and scale are unmatched in the current open-source LM landscape. Nous-Hermes-2-Vision stands as a pioneering vision-language model, leveraging advancements from the renowned OpenHermes-2.5, and they are planning on releasing different and more improved models soon. That said, Nous is very hit-and-miss with their datasets at times: MythoMax and Nous-Hermes-2-SOLAR sometimes showed perplexing responses, mentioning things that made no sense (EDIT: OMG, I feel so silly, SOLAR-10.7B is an 11B model!). I've also noticed that MythoMax-L2-13B needs more guidance to use actions and emotes than, e.g., Nous Hermes. Not every newer release is a step back, though: some newer 13B models seem to be better than older 30Bs, and Llama 2 seems better overall, but it really depends on what you want and what you can run.

I just tested Nous-Hermes-13B-GPTQ for the first time on my RTX 3060. On loader memory use, AutoGPTQ used 83%, ExLlama 79%, and ExLlama_HF only 67% of the 12 GB of dedicated memory according to the NVIDIA panel on Ubuntu, so I need 16% less memory for loading. With the new koboldcpp I launch it along the lines of: koboldcpp.exe --usecublas (or --useclblast) 0 0 --gpulayers <as many as fit> --stream --smartcontext --model <the nous-hermes-llama2-13b GGML file>.

A friend who tried a bunch of different 7B and 13B models found Chronos Hermes reliably the best at carrying a plot and responding to various scenarios, with no censorship; it seems designed for chat, roleplay, and storywriting, and it would be interesting to see chronos-hermes in the next comparison too (and also manticore-chat-pyg-guanaco, but I don't want to be greedy). I use Wizard for long, detailed responses and Hermes for unrestricted responses, which I will use for horror(-ish) novel research, as well as for writer's block and as a starting point for blog posts. In the 13B family I liked Xwin-LM-13B when I wanted an instruction-following model, until I found SOLAR-10.7B-Slerp. My last post was almost two weeks ago (an eternity in LLM land), and I updated it last week with Nous Hermes 2 Mixtral 8x7B. The 4K Nous-Hermes-Llama2 is my current favorite Llama 2 model, but the 8K version (with the extended "memory" via max_seq_len) just didn't work as well for me, so hopefully NTK-aware scaling can bring it on par with the original.
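If you launch KoboldCpp the way that command line suggests, you can also script it over its Kobold-style HTTP API; the port (5001) and the payload fields below are assumptions based on KoboldCpp's usual defaults, so check your build's API documentation.

```python
# Rough sketch: querying a locally running KoboldCpp instance over its
# Kobold-compatible HTTP API. Port and payload field names are assumptions.
import requests

payload = {
    "prompt": "Write me a 20 word poem about fire.",
    "max_length": 120,
    "temperature": 0.7,
    "top_p": 0.9,
}
r = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=300)
r.raise_for_status()
print(r.json()["results"][0]["text"])
```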
And yes, it is uncensored. Keep in mind, though, that not all 30B fine-tunes are necessarily better than every 13B fine-tune; MythoLogic and the Llama 1 Chronoboros felt pretty similar to me, and a 13B quant takes about 6-8 GB of RAM depending on context length.
Nous Hermes and Puffin (13B) can have opposite opinions. I was testing some models with random questions to see the differences and found a curious one: when you ask how you should defrost a frozen meal in a glass container, they prefer different approaches, with Hermes going for cold water and a slow defrost (less bacteria growth). Overall I'm finding it to be as good as or better than the Vicuna, Wizard Vicuna, and Wizard-uncensored models in almost every case. I do like Nous-Hermes-Llama2-13B, but after a while it starts outputting sentences that lack prepositions, and different models require slightly different prompts, like replacing "narrate" with "rewrite".

Big LLM score update: TULU, Camel, Minotaur, Nous Hermes, Airoboros 1.4. After testing so many models, I think "general intelligence" is a key, or maybe the key, to success: the smarter a model is, the less it seems to suffer from the repetition issue. For a coding assistant, just pick whatever has the highest HumanEval score, currently WizardCoder.

I've been using TheBloke's Chronos-Hermes-13B-SuperHOT-8K-GPTQ for light fiction writing. Another favorite's combination of intelligence and personality (and even humor) surpassed all the other models I tried, which include Airoboros, Chronos-Hermes, Falcon 180B Chat, Llama 2 Chat, MythoMax, Nous Hermes, Nous Puffin, and Samantha. Thanks to our most esteemed model trainer, Mr. TheBloke, we now have versions of Manticore, Nous Hermes (!!), WizardLM and so on, all with the SuperHOT 8K context LoRA. Interestingly, neither Pygmalion 13B nor MythoMax 13B can solve my puzzle by themselves, but a merge between them can. And a more accurate name for that other merge could have been Speechless-L2-Hermes-WizardLM-13B, or may I suggest Speechermeswiz-L2-13b?

Solar Hermes is generally the worst mainstream SOLAR finetune I know of: it maybe helps the prose a little, but it gives the base model a serious downgrade in IQ that isn't worth the squeeze. Looks like the DPO version is better than the SFT, though. I also keep a list of failed models with varying numbers of errors; almost all of them started writing for the user at some point, probably because of non-optimal SillyTavern settings. EstopianMaid is another good 13B model, while Fimbulvetr is a good 10.7B.
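Several comments in this thread stress using each model's own instruct format rather than free-form chat. For the Llama-era Hermes and Chronos-Hermes finetunes that generally means an Alpaca-style instruction block; here is a small helper, sketched under the assumption that your model card uses the common "### Instruction / ### Response" wording, so verify against the card you actually use.

```python
# Alpaca-style instruction prompt, as used by many Llama-era Hermes finetunes.
# The exact preamble and spacing vary by model card; check before relying on it.
def alpaca_prompt(instruction: str, system: str = "") -> str:
    preamble = f"{system}\n\n" if system else ""
    return f"{preamble}### Instruction:\n{instruction}\n\n### Response:\n"

print(alpaca_prompt(
    "Narrate the next scene using active narration and descriptive visuals.",
    system="You are a vivid storyteller.",
))
```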
In my experience with the 13B editions of Hermes and Puffin, Puffin was basically useless to me and never generated good output, while Hermes-Llama2-13B is the best overall 13B model I have used. I never have any issues with censorship, and these models work perfectly for chat and roleplay. Other options in the same family are Nous-Hermes-Llama-2 13B, Puffin 13B, Airoboros 13B, Guanaco 13B, Llama2-Uncensored-chat 13B, AlpacaCielo 13B, and many others; from the older generation, Fimbulvetr-11B-v2 by Sao10K (tested, though 8B Stheno would probably be better), Moistral-11B-v4, and Nyande_Stunna-Maid-7B-v0.2 by ChaoticNeutrals are worth a look, and Erosumika was made by a great guy whose models and account appear to have been removed, though some GGUF quants are still around.

Greetings everyone, we have some great news for all our role-playing enthusiasts: we are now offering you the opportunity to test the Nous-Hermes-Llama2-13b model, which has been finely tuned to elevate your role-playing experience. It performs not badly, but worse than Nous-Hermes-13B in my tests. It works on my server PCs and on my primary PC (16 GB RAM, 4 GB VRAM) and takes about 4-5 GB of RAM depending on context length; note this can be very tight on Windows due to background VRAM usage. The speed is fine with llama.cpp when streaming, since you can start reading right away. For hosted compute, they're actually more cost-efficient than Colab in terms of compute and storage when you run the numbers, and probably your best bet for fully managed cheap Jupyter, but you can save money if you use e.g. RunPod instead, though you'll be managing instance uptimes yourself. For a quick local alternative: ollama run nous-hermes:13b-q4_0.

Also, you probably aren't going to fully unlock the fine-tuned pathways unless you're using an instruct format. As a community, can we create a common rubric for testing the models, and a pinned post with benchmarks from that rubric run over the various 7B models, ranking them across tasks? I'll do a comparison of Hermes-LLongMA-2-13B-8K with either scaling method.

I have been testing the new generation of models (Airoboros 70B, Nous Hermes Llama 2, Chronos Hermes), and so far the ones I've tried are reluctant to use explicit language, no matter what characters I use them with. Yes, I've tried Samantha the editor, and my results with it were very, very poor compared to everything else I've tried. Looking forward to seeing how the big brother does.
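Besides `ollama run nous-hermes:13b-q4_0` on the command line, Ollama also exposes a local REST API, which is handy for scripted tests. The default port 11434 and the response field below are assumptions from Ollama's documented defaults, so double-check against your version.

```python
# Sketch: one-shot generation against a local Ollama server. Assumes the
# nous-hermes:13b-q4_0 tag has already been pulled and the default port is used.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "nous-hermes:13b-q4_0",
        "prompt": "Explain in two sentences what makes a good roleplay model.",
        "stream": False,
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["response"])
```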
What's more exciting is that we've expanded the token limit up to a whopping 3,500, instead of the standard 1,800.

LlamaGPT is a self-hosted, offline, private AI chatbot powered by Nous Hermes Llama 2; it installs on an umbrelOS home server or anywhere with Docker, and the 13B variant (GGML q4_0) wants around 16 GB for the Docker setup. On the release side, there is also Nous-Hermes-2-Yi-34B (Hermes 2 on Yi-34B), NousResearch's Nous-Capybara-3B-V1.9 (Q4_K_M), and a blog write-up titled "How the Nous-Hermes-13B AI model can help you generate high-quality content and code".

Nous Hermes is basically just an instruct dataset, iirc: it's not intended to be more personable, it's intended to respond to prompts more intelligently, and if you stick to a plain chat format you'll only get some of that by osmosis. That said, I have found that the Nous-Hermes-Llama2-13b model is very good for NSFW, provided I set the temperature as high as possible.
This list could be outdated, but: Manticore 13B (formerly Wizard Mega 13B) is now at the top of the pack of 13B models, and for 7B and 13B I definitely prefer it. I expect Mixtral 8x7B to take over the sub-70B space just like Mistral 7B took over the sub-13B space. Every week I see a new question here asking for the best models, and Nous Hermes doesn't get talked about very much in this subreddit, so I wanted to bring some more attention to it. There are also new IBM 7B/13B open-source models that beat the equivalent Llama 2 and Mistral models on Winogrande and MMLU.

For roleplay prompting, my usual prompt goes like this: <description of what I want to happen>, then "Narrate this using active narration and descriptive visuals." MythoMax or Stheno L2 both do better at that than Nous-Hermes L2 for me, and honestly Nous Hermes has always given me trouble with anatomy when dealing with futanari characters, so I avoid it; I'm writing a spicy video game and these were the models I tried, in reverse order. It feels like Chronos Hermes 13B is more of a novel writer than a chat writer, but out of all the models I've been trying so far in SillyTavern I've been having the best results with it: the replies aren't as long as Poe's, but they're well written, in character, and with little to no repetition. It's the closest I can get to what I had with my SoulmateAI.

On sampling: be careful with models like Mixtral, which have more sensitive distributions, unless you can dial in just the right combo. I've been looking into and talking about the Llama 2 repetition issues a lot, and TheBloke/Nous-Hermes-Llama2-GGML (q5_K_M) suffered the least from them; it's probably best to stick to 4K context on these. I just tried doing a scene with nous-hermes-2-solar-10.7b and found that it quickly devolved into the bot endlessly repeating itself regardless of settings, and at lower settings it eventually ends up free-associating a bunch of words, almost random (but not completely random, since they are always somehow connected to each other and to the story). Actually, any mirostat v2 parameters solve the repetition problem for me, even the defaults (mode 2, tau 5.0, eta 0.1).

One note on blind comparisons: it would only be truly blind if one of the models didn't always answer with "sure thing!" or a similar prefix, and that also makes the comparison slightly unfair since it cuts into the content; if 20% of a response is occupied by "sure thing", that answer always looks less complete. In conclusion, Synthia has improved Mistral, but it remains a 7B, and I'd still pick Mythalion 13B, or better yet one of the great 70Bs like Xwin, Synthia, or Hermes, over it.
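For the mirostat v2 tip: if you're on llama-cpp-python instead of koboldcpp, the equivalent switches look roughly like this. The tau/eta values are the commonly cited defaults rather than anything I have tuned, and the model path is a placeholder.

```python
# Sketch: enabling mirostat v2 sampling in llama-cpp-python to fight repetition.
# mirostat_mode=2 selects mirostat v2; tau/eta below are the usual defaults.
from llama_cpp import Llama

llm = Llama(model_path="models/nous-hermes-llama2-13b.Q4_K_S.gguf", n_ctx=4096)  # placeholder path

out = llm(
    "### Instruction:\nContinue the story without repeating earlier sentences.\n\n### Response:\n",
    max_tokens=200,
    mirostat_mode=2,
    mirostat_tau=5.0,
    mirostat_eta=0.1,
)
print(out["choices"][0]["text"])
```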
Nous Hermes 8x7B DPO and Dolphin 2.6 Mixtral 8x7B are the best models I've found; if you have a better one, please tell me, because that's what I use and I wasn't able to find better ones once censorship and pricing are taken into account (obviously Claude 3 or GPT-4 would beat them). By size, my current picks are: 7B, Nous Hermes Mistral 7B DPO; 10.7B, Nous Hermes 2 SOLAR 10.7B; 8x7B, Nous Hermes 2 Mixtral 8x7B DPO; 70B, Xwin-LM-70B-V0.1. For general intelligence, just take whatever has the highest MMLU/ARC/HellaSwag score and ignore TruthfulQA; currently, for 13Bs, that's OpenOrca-Platypus. I'm not sure if they'll make any improvements to the new version, but as of now most people agree the best general-purpose 7B is either OpenHermes-2.5-Mistral-7B or Nous-Hermes-2-Mistral-7B-DPO. OpenHermes-2-Mistral-7B was the first model I've used that could tool-chain well and follow instructions, so I'm looking forward to testing this out, and I'm curious how the latest Nous Hermes 2 Mistral compares to Mistral 7B v0.2. OrcaMini is Llama 1; I'd stick with Llama 2 models. I'd also like to see a Nous Hermes 2 Miqu. For the record, serpdotai/sparsetral-16x7B-v2 (HF, 4K context, ChatML format) gave correct answers to only 3+3+4+5 = 15/18 multiple-choice questions, and with just the questions and no previous information, 1+1+0+5 = 7/18.

vicuna-13B-v1.5-16K is also worth a look: 16K context instead of the usual 4K enables more complex character setups and much longer stories. 70B models would most likely be even better, but my system doesn't let me run them with acceptable speed for realtime chat, so the best for me are currently these 13Bs. I'm running the ooba text-generation-webui as a backend for the Nous-Hermes-13b 4-bit GPTQ version, with the new exllama and exllama_hf loaders, and it's really fast on my local 3060. I also just released an update to my macOS app to replace the 7B Llama2-Uncensored base model with Mistral-7B-OpenOrca; subjectively, Mistral-7B-OpenOrca is way better than Luna-AI-Llama2-Uncensored, but WizardLM-13B-V1.2's text generation still seems better.

On merges and alternatives: AlpacaCielo is a triple merge of Nous Hermes + Guanaco + Storytelling, an attempt to get the best of all worlds in a smart and creative model (more info on Hugging Face), and another favorite of mine is a merge between a custom model and Nous Hermes 13B. I've tested Mythalion 13B and it seems like a good replacement for Nous Hermes 2 13B, my normal go-to model; it even sort of managed to solve my logic puzzle that stumps other LLMs (even GPT-4), though it needs more testing. Chronos-13B-v2, by contrast, got confused about who's who, over-focused on one plot point early on, stayed vague, and stated options instead of making choices; it seemed less smart. One tip: just having a greeting message isn't enough to get a model to copy your style; ideally your character card should include examples, and your own first message should also look like what you want to get back.

Token issue with Nous-Hermes-Llama2-13b: I'm using this model for privateGPT, but when it generates it keeps saying there's a 512-token limit with the model, even though its Hugging Face repo says 4096. What can I do about this?
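On the 512-token complaint just above: that number is usually the default context window of the llama.cpp wrapper rather than a property of the model itself, so the fix is to pass the model's real context length wherever the LLM object gets built. A hedged sketch using LangChain's LlamaCpp wrapper (which privateGPT-style setups build on; the import path and call style vary by version, and the model path is a placeholder):

```python
# Hedged sketch: raise n_ctx to the model's advertised context window.
# Import path and call style differ across LangChain versions.
from langchain.llms import LlamaCpp  # newer versions: langchain_community.llms

llm = LlamaCpp(
    model_path="models/nous-hermes-llama2-13b.Q5_K_M.gguf",  # placeholder path
    n_ctx=4096,      # match the model card, not the small wrapper default
    max_tokens=512,  # length of the generated answer, separate from n_ctx
    temperature=0.7,
)
print(llm("Summarize why n_ctx matters for long prompts."))  # older call style; newer: llm.invoke(...)
```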
On the Dutch side, there's BramVanroy/Llama-2-13b-chat-dutch on Hugging Face. And here it is: Nous-Hermes-Llama-2 13B has been released and beats the previous model (announcing Nous-Hermes-13b, info link in thread). These models can be run on consumer hardware and are generally good, going by Reddit suggestions and my own experience; for those of you who haven't tried it, do, it's worth it. On my personal Hugging Face account, Teknium, I have released several models, including my work on the Replit-3B model and OpenHermes. OpenHermes 2.5 is trained on the Open Hermes 2 dataset with an added ~100k code instructions created by Glaive AI, a good day for the Nous dudes; I'll report back with my impressions once I've tested this "surpassing all Open Hermes and Nous Hermes models of the past" release.

So I'm basically wondering: are there any 13B models that are really good at this, such as Chat-Uncensored, Orca, or Nous Hermes, or are they so severely lacking next to their 70B counterparts that it makes more sense to use an API or a website to access something larger for this more occasional use? Many of these are 13B models that should work well with lower-VRAM GPUs, and I recommend trying to load them with ExLlama (HF if possible). I get pretty decent results running chronos-hermes-13b on Colab. At the 33B level you're getting into some pretty beefy wait times; I came to the same conclusion while evaluating various models, and WizardLM-7B-uncensored-GGML is effectively an uncensored 7B with 13B-like quality, according to benchmarks and my own findings. There is also a 34B CodeLlama model that came out a bit after Llama 2 7B/13B/70B, with its own finetunes for both coding and general use, plus a new model on the block called Camel, available as 13B and 33B. Using these settings I get no OOM on load or during use, and the context size reaches about 3254 tokens and hovers around that value with max_new_tokens set to 800.

Nous Hermes L2 13B at 4 bits has me really surprised; I've been using it for many days and it's now my clear favorite. You should also try models that are lower profile but more popular with the community. If anyone figures out a way to get it to stop talking or acting on behalf of my character, that'd be a plus. Vicuna 1.5, I believe, also did relatively well, but Llama 2 Chat has some flair to it, such as asking itself "Why should we unsuspend this account?", which was a nice touch. The earlier Big Model Comparison/Test (13 models tested) winner was Nous-Hermes-Llama2, comparing SillyTavern's Roleplay preset against each model's own prompt format. After going through many benchmarks and my own very informal testing, I've narrowed my favorite LLaMA models down to Vicuna 1.3, WizardLM 1.0 (and its uncensored variants), and Airoboros 1.4; we need more benchmarks between the three. Have you tried the latest Llama 3 8B merge with Kunoichi, Tiefighter, Fimbulvetr, and Nous Hermes? My settings were specifically for the airoboros-l2-13B-m2.0-GPTQ model, but they're based on similar p settings that improved Nous Hermes 13B for me too, good luck.
Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. It was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. Nous-Hermes-Llama2-13b is the Hermes model built on Llama 2 (the original Nous-Hermes-13b was built on Llama 1), and it has had support for things like WizardLM-13B-V1.2 on Apple Silicon Macs with at least 16 GB of RAM for a while now. Nous Hermes is uncensored: it'll answer sensitive questions and follow directions. There is also Hermes-LLongMA-2 8K, a series of Llama 2 models trained at 8K context length using linear positional interpolation. I've put a bunch of these into a Colab playground; try them out yourselves (tap the "Open in Colab" button), and if you find the project helpful, please share it.

My top three (note: my rig can only run 13B/7B) are wizardLM-13B-1.0, Nous-Hermes-13B, and Selfee-13B-GPTQ; the last one is interesting because it will revise its own response, though it takes longer to arrive at a final answer. For 13B I again recommend Airoboros 13B, though Manticore-Chat-Pyg is really popular, and Nous Hermes 13B is also really good in my experience. I find the 13B-parameter models noticeably better than the 7B models, although they run a bit slower on my computer (i7-8750H and a 6 GB GTX 1060). My favorite models have changed in the meantime: I now use Llama 2 13B models instead of the old LLaMA (1) 30Bs, for better quality and speed, namely MythoMax-L2-13B (smart and very good storytelling) and Nous-Hermes-Llama2 (very smart and good storytelling). It completely replaced Vicuna for me (which was my go-to since its release), and I prefer it over the Wizard-Vicuna mix, at least until there's an uncensored mix. Unfortunately, while this model does write quite well, it still only takes about 20 or so messages before it starts showing the same "catch-phrase" behavior as the dozen or so other Llama 2 models I've tried. Chronos-Hermes-13B-v2 is more storytelling than chatting, sometimes puts speech inside actions, is not as smart as Nous-Hermes-Llama2, and didn't follow instructions that well.

Two weeks ago I downloaded the Nous-Hermes 13B model and tested it with the default preset, which was "LLaMA-Precise" at the time, and I was impressed by how well it could handle math problems. However, after I updated the webUI to the latest version, the model could no longer do math at all. On censorship: when I ask Nous Hermes 13B to write a violent scene it does it without complaining, but if I ask the same of Nous Hermes 13B SuperHOT 8K it gives me "ethical" advice or just refuses, even when my character card is totally OK with something like that; I even tried forcing outputs to start a certain way, but it's still too "clean" to have any fun with.

I just released the NeuralHermes-2.5-Mistral-7B model, which is a DPO fine-tuned version of OpenHermes-2.5-Mistral-7B by Teknium. It's a simple proof of concept: I used Intel's orca_dpo_pairs (from neural-chat-7b-v3-1) in a ChatML format, and Teknium, the creator of the SFT model, confirmed on Twitter that this version improves benchmark scores on AGIEval, GPT4All, and TruthfulQA. I've done minimal testing, but so far it works pretty well and I much prefer its outputs over base Hermes.

For the GPT4All-style local setup, the ones I downloaded were the "nous-hermes-llama2-13b" and "Wizard-Vicuna-7B-Uncensored" q4_0 .bin files; move the models to the llama directory you made above. The number after the q represents the number of bits used for quantization, and by default Ollama uses 4-bit quantization.
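To make "the number after the q" concrete, here is a back-of-envelope sizing calculation for a 13B model at different quantization levels. The bits-per-weight figures are rough nominal values I'm assuming; real GGUF/GGML files differ somewhat because of quantization block overhead and layers that stay at higher precision.

```python
# Back-of-envelope quant sizing for a 13B-parameter model.
# Bits-per-weight values are rough nominal figures, not exact file sizes.
PARAMS = 13e9
NOMINAL_BPW = {
    "q2_K": 2.6, "q3_K_S": 3.5, "q4_0": 4.5, "q4_K_M": 4.8,
    "q5_K_M": 5.7, "q6_K": 6.6, "q8_0": 8.5,
}

for tag, bpw in NOMINAL_BPW.items():
    gib = PARAMS * bpw / 8 / 2**30
    print(f"{tag:8s} ~{gib:5.1f} GiB on disk (plus KV cache and runtime overhead)")
```

This lines up roughly with the anecdotes above: a q4-class 13B file is in the 7-8 GB range, which is why people report it fitting on machines with 8-16 GB of RAM once context length is kept modest.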
This is a follow-up to my previous posts here: New Model RP Comparison/Test (7 models tested) and Big Model Comparison/Test (13 models tested). Originally planned as a single test of 20+ models, I'm splitting it into two segments to keep the post manageable in size: first the smaller models (13B + 34B), then the bigger ones (70B + 180B). For reference, the results so far: Part 1 (15 models tested, 13B + 34B) winner: Mythalion-13B; Part 2 (7 models tested, 70B + 180B) winners: Nous-Hermes-Llama2-70B and Synthia-70B-v1.2b; and the earlier RP comparison (7 models tested) winners: MythoMax-L2-13B and vicuna-13B-v1.5-16K. Finally, after a lot of hard work, here is my latest (and biggest, considering model sizes) LLM comparison/test, the long-awaited second part of the previous one: 2x 34B Yi (Dolphin, Nous Capybara) vs. 12x 70B, 120B, and ChatGPT/GPT-4. I've added some models to the list, expanded the first part, and sorted the results into tables, and I've also run my usual tests to update my rankings with a diverse mix of six new models from 1.6B to 120B: StableLM 2 Zephyr 1.6B, DiscoLM German 7B, Mixtral 2x7B, Beyonder, and others. I'm always using SillyTavern with its "Deterministic" generation settings preset and the new "Roleplay" instruct mode preset with these settings, and I'll check a few models again with different SillyTavern settings. MXLewd-L2-20B is fascinating me a lot despite the technical issues I'm having with it; having a 20B that's faster than the 70Bs and better than the 13Bs would be very welcome.

I also tried the speechless-llama2-hermes-orca-platypus-wizardlm-13b merge. Meanwhile, Hermes 3 was created by fine-tuning Llama 3.1 and contains advanced long-term context retention and multi-turn conversation capability, complex roleplaying and internal-monologue abilities, and enhanced agentic function calling; its training data aggressively encourages the model to follow the system and instruction prompts exactly and in an adaptive manner. I just tried Nous Hermes 13B a bit and noticed it gets incoherent faster after 2K tokens. And this is version 2 of Nous Research's line of Hermes models: Nous Hermes 2 builds on the Open Hermes 2.5 dataset, surpassing all Open Hermes and Nous Hermes models of the past, trained over Yi 34B with others to come.

Some practical notes to close on. I spent many hours trying to get Nous Hermes 13B to run well, but it's still painfully slow and runs out of memory just trying to do inference; this may be the blind leading the blind, but I found the little table printed by llama.cpp's "quantize" command helpful. Would it even be possible to train a 7B model, at the very least? I have a Ryzen 3600, but I'm planning on trying more of llama.cpp to see if it makes a difference offloading a bit of the work to the CPU. 12 GB of VRAM is sufficient for full 13B offload using the current version of koboldcpp, as well as with ExLlama; if you want to upgrade, the best thing to do would be a VRAM upgrade, like a 3090. For 34B I suggest you choose ExLlama 2 quants; for 20B and 13B you can use other formats and they should still fit in 24 GB of VRAM. And with the bugfix for mirostat v2 in a recent koboldcpp release, I tried both versions on Nous-Hermes-Llama2 13B and it seems to work so far without those annoying repetitions. Please note that storage is not included in this and is fairly expensive for both block and shared drives. The prompt format I use for story generation starts with a title and tags, e.g. Title: Eternal Flame; Tags: Religious Zealotry, Space Opera, Futuristic, Interstellar, Bleak, Dry, Sad, Tragic.