Found his record in Russia's official company registry. This is what he officially does as an entrepreneur:
56.10 — Restaurant activities and food delivery services
47.23 — Retail sale of fish, crustaceans, and mollusks in specialized stores
47.25.12 — Retail sale of beer in specialized stores
47.25.2 — Retail sale of soft drinks in specialized stores
47.29.39 — Retail sale of other food products in specialized stores, not included in other groups
68.20 — Lease and management of own or leased real estate
Money is reinvested into selling beer and fish :) Interestingly, he registered all that in 2019, just when the ransoms started.
I find it entertaining that even for a member of a Russian hacking gang, the real threat is the Russian tax authorities. Regardless of how you got the money, you still need to pay taxes on it.
Schukin isn't a very common last name (definitely not Ivanov-tier). The first name, the patronymic (his father is Maksim), and the last name all match, as well as the city (the article says he lives in Krasnodar). In fact, this Krasnodar-based entrepreneur is the only person who shows up at all in a search for "Daniil Maksimovich Schukin". Not to mention that the business was registered right when the ransoms started (2019). Too many coincidences if it's just a namesake.
Qwen3.5 comes in various sizes (including 27B), and judging by the posts on HN, r/LocalLlama, etc., it seems to be better at logic/reasoning/coding/tool calling than Gemma 4, while Gemma 4 is better at creative writing and world knowledge (basically nothing changed from the Qwen3 vs. Gemma3 era).
For llama-server (and possibly other similar applications) you can specify the number of GPU layers (e.g. `--n-gpu-layers`). By default this is set to run the entire model in VRAM, but you can set it to something like 64 or 32 to make it use less VRAM. This trades away speed, since the layers that don't fit on the GPU run on the CPU instead, but it allows you to run a larger model, a larger context, or additional models.
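For example (the model path and context size here are just placeholders), something like `llama-server -m ./model.gguf --n-gpu-layers 32 -c 8192` should offload only 32 layers to the GPU and keep the rest on the CPU; `-ngl` is the short form of the same flag.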
Indeed, thanks for pointing this out and for the links. In my excitement I misread it as an MR from the fork to the main project.
I don’t think I’m able to fix the title though.
I find it quite exciting to see some results from an effort to understand whether TurboQuant's main ideas can be applied to model weights. There are other similar projects, so we'll see, but some of this fork's results look promising.
>One theory is that the knowledge required to solve the task is already stored in the parameters of the model, and only the style has to change for task success
>In particular, learning to generate longer outputs may be possible in few parameters
>we develop budget forcing to control test-time compute by forcefully terminating the model’s thinking process or lengthening it by appending “Wait” multiple times to the model’s generation when it tries to end. This can lead the model to double-check its answer, often fixing incorrect reasoning steps
Maybe, indeed, the model simply learns to emit the EOS token (or similar) later, and the capability is already there in the base model.
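As a rough illustration of the budget-forcing idea quoted above (a minimal sketch, not the paper's code: it assumes a Hugging Face transformers-style loop, a `</think>` end-of-thinking marker, and a placeholder model name, and it glosses over special-token/chat-template handling):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "some-reasoning-model"  # placeholder, not a real checkpoint name
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

text = "Question: ...\n<think>"  # assume the model reasons inside <think>...</think>
for _ in range(3):  # force up to 3 extra rounds of thinking
    ids = tok(text, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=512)
    text = tok.decode(out[0], skip_special_tokens=True)
    if "</think>" not in text:
        break  # the model never tried to end its thinking within the budget
    # Strip the end-of-thinking marker and append "Wait" so the model
    # keeps reasoning instead of committing to an answer.
    text = text.rsplit("</think>", 1)[0].rstrip() + " Wait"
# After the loop you would let the model close its thinking and answer normally.
```

If an intervention this simple works, it fits the idea that the underlying capability is already in the base model and mostly the stopping behavior changes.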
Last time someone asked you to take down a post, you said "bitch come suck my dick", according to your own blog.