Qwen2.5: A Party of Foundation Models!

brucethemoose@lemmy.world · edited 16 hours ago

Llama 3.1 is not even a “true” distillation either, but it’s kinda complicated, like you said.

Yeah Qwen undoubtedly has synthetic data lol. It’s even in the base model, which isn’t really their “fault” as it’s presumably part of the web scrape.

brucethemoose@lemmy.world · edited 1 day ago

You can use larger “open” models through free or dirt-cheap APIs though.

TBH local LLMs are still kinda “meh” unless you have a high-VRAM GPU. I agree that 8B is kinda underwhelming, but the step up to like Qwen 14B is enormous.
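
For a rough sense of why VRAM is the bottleneck, here is a back-of-the-envelope sketch. It only counts the weights, treats "8B" and "14B" as nominal parameter counts, and uses an assumed 4.5 bits per weight for a typical 4-bit quant plus an assumed flat 2 GiB of headroom for KV cache and buffers. The helper names (`weights_gib`, `fits`) and the overhead figure are made up for illustration, so take the results as ballpark only.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed flat overhead fit in VRAM.

    overhead_gib is a guess standing in for KV cache, activations and
    runtime buffers; real usage depends on context length and backend.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for name, size in [("8B", 8), ("14B", 14)]:
        for bits in (4.5, 16):
            print(f"{name} at {bits} bits/weight: "
                  f"{weights_gib(size, bits):.1f} GiB of weights")
```

Under those assumptions a quantized 14B lands comfortably on a 12 GB card, while the same model at 16 bits does not fit even in 24 GB.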

brucethemoose@lemmy.world · 1 day ago

I don’t think Qwen was trained with distillation, was it?

It would be awesome if it was.

Also you should try Supernova Medius, which is Qwen 14B with some “distillation” from some other models.

brucethemoose@lemmy.world · edited 3 days ago

All the other brands went along

(My 2020 G14 has 3 A ports and ethernet, but still…)

brucethemoose@lemmy.world · 3 days ago

Yes blocking major backers is bad, I agree with that. The mod behind this kind of sounds unpleasant too.

What does being on Lemmy matter?

Discord is like the antithesis of Lemmy, a siloed off, inefficient, unscrapable, private, proprietary and dangerously monopolistic echo chamber. I’ve seen it swallow too many of my niches, and from my experience, it turns people into jerks.

Hence what I’m getting at is that this may not have happened without all that nonsense in the unofficial Discord.

brucethemoose@lemmy.world · 3 days ago

You mean a few github accounts and a bunch of jerks on Discord, which you are bringing up on Lemmy?

If anything this is just another repudiation of Discord as a whole. I hate how it’s eating the internet like mad cow disease.

brucethemoose@lemmy.world · edited 4 days ago

100%

My mom went to an integrated school in the South, made friends… but sometimes overheard racist slurs and threats behind closed doors. Same story with family I have now, all pleasant in public, friends with some gay family members. But vehemently anti-vaccine and such behind closed doors… I have horrible stories I can’t even repeat.

The duality is unreal.

A question is where that behind-doors comes from… a lot is from church. Church like you’ve never seen if you haven’t been to the South.

brucethemoose@lemmy.world · 6 days ago

This is the last antitrust win we’ll get for years, isn’t it?

I know Trump doesn’t like Big Tech, but I doubt his admin will punish them meaningfully; they’ll just rail about censorship.

brucethemoose@lemmy.world · 6 days ago

I fear this is exactly who they’re courting.

brucethemoose@lemmy.world · edited 8 days ago

I feel like they are dropping the ball in the GPU space though, both on desktop and in servers.

They’re not really leveraging it. They killed the Steam Deck line of “small core count, GPU heavy APUs”, which is why Valve hasn’t updated it and competitors seem so power hungry. They all but killed server APUs, making them mega expensive and HPC only. They’re finally coming out with an M-Pro-like consumer APU, but it took until 2025, and pricing will probably be a joke just like their Radeon Pro GPUs…

And I don’t even wanna get into the AI space. They get like 99% there and then go “nah, we don’t really care about this market, let Nvidia have their monopoly and screw everyone over.” It makes me want to pull my hair out.

brucethemoose@lemmy.world · 8 days ago

Unless you have an Nvidia card.

I’ve been on Linux for years, I work with the Nvidia libraries all the time, I alternate booting Wayland and X… I even use my AMD IGP as output these days, instead of the Nvidia card.

And I STILL hold my breath wondering if I’m going to get a black screen, and have to go into tty mode or boot from a USB stick to investigate and fix it.

brucethemoose@lemmy.world · edited 9 days ago

I never liked Musk, even when he was “In.” Even the Mars colonization meme rubbed me the wrong way, as the science does not line up with that.

It felt like a cult of personality to me. He was always a fickle jerk, a mixed bag.

You have a point though, people’s opinions were largely political, I think. Or just based on pure hope/cultism.

brucethemoose@lemmy.world · edited 9 days ago

+1

They should discourage institutions from using it (and use government Mastodon instances of course). This is honestly long overdue.

brucethemoose@lemmy.world · 12 days ago

Don’t jinx it.

Especially not if they somehow coincidentally get some government funding.

brucethemoose@lemmy.world · 13 days ago

I’d posit the algorithm has turned it into a monster.

Attention should be dictated more by chronological order and what others retweet, not what some black box thinks will keep you glued to the screen, and it felt like more of the former in the old days. This is a subtle, but also very significant change.

brucethemoose@lemmy.world · edited 13 days ago

On the other hand, the track record of old social networks is not great.

And it’s reasonable to posit Twitter is deep into the enshittification cycle.

brucethemoose@lemmy.world · 14 days ago

Still perfectly runnable in kobold.cpp. There was a whole community built up around it with Pygmalion.

It is as dumb as dirt though. IMO that is going back too far.

brucethemoose@lemmy.world · 14 days ago

People still run or even continue pretraining Llama 2 for that reason, as its data is pre-slop.

brucethemoose@lemmy.world · edited 14 days ago

The Facebook/Mastodon format is much better for individuals, no? And Reddit/Lemmy for niches, as long as they’re supplemented by a wiki or something.

And Tumblr. The way content gets spread organically, rather than with an algorithm, is actually super nice.

IMO Twitter’s original premise, of letting novel, original, but very short thoughts fly into the ether, has been so thoroughly corrupted that it can’t really come back. It’s entertaining and engaging, but, like Discord, an awful format for actually exchanging important information.

brucethemoose@lemmy.world · 14 days ago

This is called prompt engineering, and it’s been studied objectively and extensively. There are papers where many different personas are benchmarked, or even dynamically created like a genetic algorithm.

You’re still limited by the underlying LLM though, especially something so dry and hyper sanitized like OpenAI’s API models.
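
To make the "dynamically created like a genetic algorithm" idea concrete, here is a toy sketch of the general shape: personas are lists of traits, the better half of the population survives each round, and children come from crossover plus mutation. Everything in it is an assumption for illustration: the trait list, the prompt template, and especially `toy_score`, which is a stand-in for actually running a benchmark against a model. It is not taken from any particular paper.

```python
import random

# Hypothetical trait vocabulary; a real search space would be richer.
TRAITS = ["terse", "skeptical", "step-by-step", "friendly", "pedantic", "expert"]


def make_prompt(traits):
    """Turn a trait list into a system prompt (made-up template)."""
    return "Adopt this persona: " + ", ".join(traits) + "."


def toy_score(traits):
    """Stand-in for a benchmark run; a real setup would query an LLM here."""
    weights = {"expert": 3, "step-by-step": 2, "skeptical": 1}
    return sum(weights.get(t, 0) for t in set(traits))


def mutate(traits, rng):
    """Replace one randomly chosen trait."""
    child = list(traits)
    child[rng.randrange(len(child))] = rng.choice(TRAITS)
    return child


def crossover(a, b, rng):
    """Splice the head of one parent onto the tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(score, generations=20, pop_size=8, n_traits=3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(TRAITS) for _ in range(n_traits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # the top half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return max(pop, key=score)


if __name__ == "__main__":
    best = evolve(toy_score)
    print(make_prompt(best), "score:", toy_score(best))
```

Swapping `toy_score` for a function that sends `make_prompt(traits)` to a model and grades the answers is where the real cost, and the ceiling set by the underlying LLM, comes in.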

brucethemoose@lemmy.world · 2 months ago

Qwen2.5: A Party of Foundation Models!

brucethemoose@lemmy.world · edited 3 months ago

Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.