Web3Gen0

@web3gen0

212 Following
138 Followers


Web3Gen0
@web3gen0
MultiTalk: Generating Multi-Person Conversations from Just Audio
Researchers from Meituan, HKUST, and Sun Yat-sen University have introduced MultiTalk, a framework for generating multi-person, audio-driven conversational videos. Unlike earlier methods that focused only on single-person talking heads, MultiTalk handles multi-stream audio, keeps lip sync correct for each individual, and follows detailed scene instructions like “a man and a woman were talking, and then they kissed.”
A key innovation is Label Rotary Position Embedding (L-RoPE), which helps bind the right audio stream to the right person. The model also preserves instruction-following through training strategies such as partial parameter training and multi-task training.
From virtual actors to e-commerce livestreams, the potential use cases are huge. https://arxiv.org/pdf/2505.22647v1
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
🚧 Building a Food Additive Checker Web App: Need Better Data Sources!
I'm working on a web app (built with Langflow, Python, and Streamlit) that lets users upload food label images. It extracts ingredients via OCR, auto-generates reports on food additive safety, and allows users to ask questions about those additives.
Previously, I used a manually created PDF as the knowledge base, but it wasn’t scalable or reliable. I'm now looking for better, structured sources or APIs that can:
> Identify additives from OCR text (like “E330” or “sodium benzoate”)
> Indicate whether they’re safe to consume
> Optionally support Q&A or context-aware querying
Is there any API, open dataset, or model context protocol (MCP) you’d recommend? Thanks in advance! 🙏
0 reply
0 recast
1 reaction
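
A minimal sketch of the first requirement in the post above (identifying additives in OCR text), assuming a small hand-written lookup table. `KNOWN_ADDITIVES`, `NAME_TO_CODE`, `E_NUMBER`, and `find_additives` are hypothetical names for illustration, not part of the app or of any real API. The table deliberately carries no safety verdicts, since those would have to come from a vetted data source.

```python
import re

# Hypothetical lookup table for illustration; a real app would load this
# from a vetted dataset rather than hard-coding it.
KNOWN_ADDITIVES = {
    "E330": "citric acid",
    "E211": "sodium benzoate",
}
# Reverse mapping so common names can be resolved to E-numbers.
NAME_TO_CODE = {name: code for code, name in KNOWN_ADDITIVES.items()}

# Matches "E330", "e 211", "E-150d": the letter E, an optional space or
# hyphen, three or four digits, and an optional trailing letter.
E_NUMBER = re.compile(r"\bE[\s-]?(\d{3,4}[a-z]?)\b", re.IGNORECASE)


def find_additives(ocr_text: str) -> list[str]:
    """Return normalized E-numbers found in OCR text, in order of discovery."""
    found: list[str] = []
    # Pass 1: explicit E-numbers, normalized to the form "E330" / "E150d".
    for match in E_NUMBER.finditer(ocr_text):
        code = "E" + match.group(1).lower()
        if code not in found:
            found.append(code)
    # Pass 2: common names from the lookup table, matched case-insensitively.
    lowered = ocr_text.lower()
    for name, code in NAME_TO_CODE.items():
        if name in lowered and code not in found:
            found.append(code)
    return found
```

For example, `find_additives("Ingredients: water, sugar, E330, Sodium Benzoate, e 211")` returns `["E330", "E211"]`: the name "Sodium Benzoate" resolves to E211, which the regex pass has already found, so it is not repeated. Real OCR output is noisier than this (for instance `O` misread for `0`), so fuzzy matching would likely be needed in practice.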

Web3Gen0
@web3gen0
Having worked on a responsible AI framework for 8 months, I’ve seen how hard it is to get teams to follow proper safeguards, especially around copyright and content use. Cases like Disney and Universal suing Midjourney show why compliance isn’t optional. Many still treat it as a checkbox, but in reality it's the foundation for building AI we can trust. 🙂
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
🚨 New from Mistral: Magistral Released!
Mistral has just announced Magistral, a dual-release model built for real-world reasoning and feedback-driven learning.
🔹 Magistral Small (24B, open-source)
🔹 Magistral Medium (enterprise-grade)
On AIME2024, they scored:
📊 Magistral Medium: 73.6% (90% w/ majority voting)
📊 Magistral Small: 70.7% (83.3% w/ majority voting)
Highlights:
🧠 Native chain-of-thought across languages
⚡ 10x faster answers with Think mode and Flash Answers
🔬 Backed by a new paper detailing model training, RL techniques & reasoning benchmarks
Magistral Small is open to the community... ready to be explored, improved, and integrated. A solid step forward in building thinking language models. https://mistral.ai/news/magistral
0 reply
0 recast
1 reaction
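
For readers unfamiliar with the "majority voting" figures quoted above: the general idea is to sample several answers to the same question and keep the most frequent one. Below is a minimal sketch of that idea only. `majority_vote` is a hypothetical helper and the sample answers are made up; this is not Mistral's evaluation code, and the number of samples used for the reported scores is not stated in the post.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer among several sampled answers.

    Ties are broken in favor of the answer that appeared first,
    because Counter.most_common preserves first-seen order for equal counts.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


# Made-up final answers from five independent samples of one question.
samples = ["204", "204", "113", "204", "73"]
print(majority_vote(samples))
```

A single sample can land on a wrong answer such as "113", while the vote over several samples recovers the answer the model produces most consistently, which is why the voting scores are higher than the single-attempt scores.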

Web3Gen0
@web3gen0
2 eth soon
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
Thanks for sharing. Ilya is currently one of the smartest people on earth.
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
Just read an eye-opening article on the real costs of running open-source LLMs. This is highly recommended for teams considering in-house deployments. If you're assuming open-source means free, this breaks down why "the download is free, but the cost is operational." A must-read for anyone budgeting beyond the API hype. https://artificialintelligencemadesimple.substack.com/p/the-real-cost-of-open-source-llms
1 reply
0 recast
4 reactions

Web3Gen0
@web3gen0
Let's see what the future holds for us.
0 reply
0 recast
0 reaction

dyor
@dyor.eth
GM replyguys, USDC raindrop time. Will do $2 recast tips for the best replies and a $5 "ok banger" recast for a random post. Followers only.
84 replies
99 recasts
252 reactions

Web3Gen0
@web3gen0
Which MCPs would you choose to demonstrate to a non-technical audience?
0 reply
0 recast
0 reaction

Web3Gen0
@web3gen0
I worked for 8 months on a Responsible AI project, where we were encouraged to design AI solutions that adhere to Responsible AI principles and remain compliant with ethical and regulatory standards. In practice, however, many developers overlook these guidelines, whether because of tight deadlines or lack of awareness. AI does have the potential to benefit working people, but that benefit hinges on how thoughtfully it is developed and deployed. We can’t afford to treat Responsible AI as an afterthought... it needs to be baked into the process, not bolted on later. Embracing AI also means embracing the responsibility that comes with it.
0 reply
0 recast
0 reaction

Web3Gen0
@web3gen0
Just dropped: FLUX.1 Kontext, a whole new way to generate and edit images. Unlike traditional text-to-image models, FLUX.1 Kontext lets you use both images and text to guide creation. Want to extract a concept from one image and remix it into another with your own prompt? This model suite gets it done... your images, your words, your world. Backed by generative flow matching, FLUX.1 Kontext takes in-context image generation to the next level. Explore the tech and the BFL Playground in the official post 👉 https://bfl.ai/models/flux-kontext
3 replies
0 recast
2 reactions

Web3Gen0
@web3gen0
👀
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
Just came across this fascinating concept: the Darwin Gödel Machine, a self-improving AI that rewrites its own code to get better at programming. Loved reading Richard's explanation of it (link below). It's inspired by the Gödel Machine idea but takes a more practical route using open-ended algorithms and foundation models. Basically: AI learning how to learn, better. What do you think about AI improving its own learning methods like this? https://richardcsuwandi.github.io/blog/2025/dgm/
0 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
Some snaps from AI for Bharat in Kolkata with 90+ builders!
0 reply
0 recast
2 reactions

Web3Gen0
@web3gen0
If you are into software development, grab your slice of 1M USD. https://hackathon.dev
0 reply
0 recast
3 reactions

Web3Gen0
@web3gen0
Thanks @bitfloorsghost.eth for the tip!
0 reply
0 recast
0 reaction

Web3Gen0
@web3gen0
Xiaomi has unveiled MiMo-VL-7B, a lightweight yet powerful vision-language model featuring a high-resolution ViT encoder, an efficient MLP projector, and a custom MiMo-7B language model for advanced reasoning. Trained through multi-stage pretraining and Xiaomi’s new Mixed On-policy Reinforcement Learning (MORL), MiMo-VL-7B-RL excels in perception, grounding, and logical reasoning, marking a strong step forward in efficient multimodal AI. https://huggingface.co/XiaomiMiMo/MiMo-VL-7B-RL
0 reply
0 recast
2 reactions

Web3Gen0
@web3gen0
At the current stage, if I pitch a performance-based revenue model where we earn based on clearly defined outcomes like reduced human intervention, it would be seen as innovative and aligned with client value. This approach could ease change management and stand out in industries like manufacturing, where ROI clarity is often a challenge. However, even with clear success metrics, internal stakeholders would likely be cautious. They’d be concerned about revenue unpredictability and delivery accountability. A fully performance-only model may feel too risky, so they’d probably prefer a hybrid version such as a base fee plus outcome-based incentives, where charges adjust depending on the level of human support needed. To gain buy-in, I’d need a strong, small-scale proof of concept showing impact and feasibility. Positioning it as a low-risk pilot rather than a full shift would make it more acceptable to both internal teams and clients, while laying the groundwork for broader adoption.
1 reply
0 recast
1 reaction

Web3Gen0
@web3gen0
Hey, I just tried out Chatterbox by Resemble AI. Pretty wild how fast it clones voices! 😮 Have you given it a shot yet? Also curious what your thoughts are on how we should handle the risks that come with voice cloning tech. https://huggingface.co/ResembleAI/chatterbox
0 reply
0 recast
1 reaction