Alex

@alexdyor

246 Following

46 Followers

OpenAI showed ChatGPT agent: Deep Research and Operator in one bottle. The new agent can browse web pages multimodally, call APIs and tools, and reason its way through tasks. Tool use gets special emphasis: the agent was trained with RL specifically to work with tools. It creates diagrams and presentations, generates images, can log in to sites, and can use the terminal. The result on Humanity's Last Exam is 42%, a serious jump over o3 and even Deep Research. There is also noticeable progress on FrontierMath. It's cool to watch breakthroughs like this.

By the way, Perplexity also launched its AI browser, Comet (https://comet.perplexity.ai/). It has an agent assistant in the sidebar. Beyond the usual contextual answers about open tabs, this agent can also search, compare products, make reservations, respond to emails, and order pizza. The assistant "sees" pages, analyzes YouTube videos, Google Docs, or PDFs, and provides summaries or insights. Everything is built on Chromium, so you can import extensions, bookmarks, and settings from Google Chrome with one click. Security? Data is stored locally, Comet does not train the model on personal data, and it includes an ad blocker. So far access is limited to Perplexity Max subscribers ($200/month), so we are waiting for the public rollout. Mac and Windows are supported, and a mobile version is on the horizon.

Replit Launches Dynamic Intelligence
Replit, in my humble opinion one of the best tools for rapid prototyping, launched a new Dynamic Intelligence feature (https://blog.replit.com/dynamic-intelligence) that aims to improve vibe coding results through:
• Extended Thinking: the agent pauses, thinks, and shows its reasoning before giving an answer.
• High Power Model: a more powerful LLM is brought in on request to rework the architecture, catch bugs, and improve performance.
• Web Search: the agent goes online by itself, fills gaps in its knowledge, and brings fresh solutions directly into the code.
The functionality is already available: you just need to add Use Web Search to the prompt or enable Extended Thinking.

Limits on image generation have been removed on Freepik. No more credits, queues, or waiting. Fully unlimited. Subscribers on the Premium+ and Pro plans get unlimited generation with the following image models: Mystic, Google Imagen, Flux, Seedream, Ideogram, Runway References, GPT Image1, and Freepik's Classic Models. No tokens. No caps. No waiting. Generate as much as you want. We are waiting for a response from Krea and other aggregators.

very good idea)))

no need to thank)))) : https://x.com/townsxyz/status/1941474803694903617

Perplexity Announces Its Own AI Browser
Perplexity's CEO announced the imminent release of the company's new AI browser. The product is almost ready and will soon be publicly available. You can already apply for early access (https://www.perplexity.ai/comet/). The developers are completing final testing. Something similar to Operator is expected, but with a more user-friendly interface. The main feature is an agent that can perform any action in the browser, from searching to paying bills, making bookings, and working with documents. I wonder how it will work in practice.

🔥 Suno has a serious competitor
Wondera AI has appeared, a neural network that creates studio-level musical compositions in a matter of seconds. And yes, it really generates in real time and lets you edit instantly: cutting, changing timbre, adding various audio effects. Everything works through a chat window, and you can generate a certain amount for free. But to download, you need a subscription)) The resulting tracks are of professional quality: this is a full-fledged recording studio on your computer. The creators actively teach users, demonstrating how to write prompts that produce high-quality compositions. The gallery offers hundreds of tracks as examples for inspiration. Find out what kind of Beethoven you are today here: (https://www.wondera.ai/)

Google launches Doppl: AI try-on directly from a photo
A new app that brings your photo to life as a video in which you are already wearing a different look. It doesn't just swap the clothes, it shows how the outfit looks in motion. Just like a fitting, only without the fitting.
1️⃣ Upload your photo
2️⃣ Choose a look
3️⃣ See what really fits and what is a fashion disaster
An ideal option when you need to evaluate a look without any harsh words. Mirror, stylist, and clothes rail, all in one. Available on iOS and Android (US only) (https://blog.google/technology/google-labs/doppl/)

ElevenLabs launches its own Siri with support for Perplexity and Notion
The company, known for its audio models, has taken a step forward and created a full-fledged voice assistant. It recently amazed the world with the best text-to-speech model, with incredibly realistic voices. Instead of reinventing the wheel, they simply combined their Eleven v3 model with RAG, Perplexity, and useful MCPs: Linear, Notion, Slack, etc. Main advantages:
• Minimal response delay
• Choice of 5000 voices (including your own)
• Ability to add any MCP
You can try it here (11.ai/)

Seedance 1.0 Pro (not Lite or Mini) has been rolled out on Fal.ai! Go test it! They probably have the richest lineup of video generators there:
1️⃣ Seedance 1.0 Pro (and Lite)
2️⃣ Hailuo 02 Pro (and Standard)
3️⃣ Veo 3
4️⃣ Kling 2.1 Master (and Standard/Pro)
https://fal.ai/models/fal-ai/bytedance/seedance/v1/pro/image-to-video
image2video is also available.

MiniMax is holding a week of releases, and two powerful models are already out
Chinese startup MiniMax is following DeepSeek's example and holding its own week of releases. What has been shown so far:
On Monday: the open-source M1 reasoning model with a one-million-token context window (a record!). It cost only $500k, and according to the results it is approaching Gemini 2.5 Pro. Report, GitHub (https://github.com/MiniMax-AI/MiniMax-M1), and weights (https://huggingface.co/collections/MiniMaxAI/minimax-m1-68502ad9634ec0eeac8cf094).
Yesterday: the text/image2video model Hailuo 02 with advanced physics and motion. You can try it for free (https://hailuoai.video/create).
We are waiting to see what they release today.

#daіlymonad https://madness.finance/exp?ref=WLWBYJ earned 200 points on the leaderboard

Microsoft introduced a new video generation tool: Bing Video Creator! Users of the Bing mobile app on Android and iPhone can now create short video clips for free using Sora, OpenAI's text-to-video model. The launch underscores Microsoft's desire to democratize AI technologies, letting anyone easily turn text into engaging videos. Users can generate up to three videos at once and choose between standard and fast creation speed. It is another sign of the growing interest in AI tools for media formats.

OpenAI has launched o3-pro for users with a Pro subscription in ChatGPT and via the API. It is slightly better than o3 in coding, analytics, science, and writing. But it is not such a breakthrough that it is worth paying $200 for. So we are waiting for it to be rolled out to Plus subscribers. Tests show that the model follows instructions better and generates more structured answers. It also makes things up less often: in testing it gave the correct answer to the same question four times in a row. Like o3, o3-pro supports all the tools available in ChatGPT. In addition, OpenAI has significantly reduced the price of the o3 model via the API. And Sam Altman added that the open-weight model will not be released in June as planned. It has been postponed to later this summer because, they say, something unexpected and very powerful awaits us in June. Is it GPT-5?

ElevenLabs presented Eleven v3 (alpha), the most expressive text-to-speech model to date
It supports 70+ languages, multi-voice mode, and now audio tags that set intonation, emotions, and even pauses in speech. The new architecture understands text and context better, creating natural, "live" audio. What Eleven v3 can do:
• Generate realistic dialog with multiple voices
• Read emotional transitions
• React to the context and change tone during the speech
The model is controlled through tags:
- Emotions: [sad], [angry], [happily]
- Delivery: [whispers], [shouts]
- Reactions: [laughs], [sighs], [clears throat]
The public API is promised very soon. This is a preview version, so prompts may need fine-tuning. But the result is really impressive.
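For anyone who wants to play with the tags, here is a rough Python sketch of how a tagged script could be put together before pasting it into the model. It only builds and inspects plain text and does not call any ElevenLabs API. The helper names (`tagged_line`, `extract_tags`) are made up for illustration, and the tag list is just the one from this post, so the real set supported by Eleven v3 may well be different.

```python
import re

# Tags named in the post; the full set supported by Eleven v3 may differ.
KNOWN_TAGS = {
    "sad", "angry", "happily",            # emotions
    "whispers", "shouts",                 # delivery
    "laughs", "sighs", "clears throat",   # reactions
}

# Matches anything written in square brackets, e.g. "[sighs]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def tagged_line(text, *tags):
    """Prefix a line of speech with audio tags, e.g. '[sad] Hello'."""
    unknown = [t for t in tags if t not in KNOWN_TAGS]
    if unknown:
        raise ValueError(f"tags not in the post's list: {unknown}")
    prefix = " ".join(f"[{t}]" for t in tags)
    return f"{prefix} {text}".strip()


def extract_tags(script):
    """Return every bracketed tag found in a script, in order of appearance."""
    return TAG_PATTERN.findall(script)


# Hypothetical three-line script mixing emotion, delivery, and reaction tags.
script = "\n".join([
    tagged_line("I really thought we had more time.", "sad"),
    tagged_line("Keep your voice down.", "whispers"),
    tagged_line("Fine, you win.", "sighs", "happily"),
])
print(script)
```

Checking tags against a fixed list is only a convenience for catching typos here; since this is a preview model, expect to tweak both the tags and the wording by ear.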

At WWDC 2025 Apple presented Liquid Glass, a new design language for its operating system interfaces that changes how apps look across devices. The update also includes a new Games app, improved multitasking on iPadOS, and ChatGPT integration in Apple's Image Playground. The announcements underscore Apple's commitment to advancing AI and open up new opportunities for developers working with on-device intelligence.

OpenAI has updated Codex, and now it's even closer to a real AI developer. Here's what's been added:
1️⃣ Codex is now available for ChatGPT Plus. For now the limits are generous, but restrictions may appear during high load to keep the model stable.
2️⃣ The most anticipated update: Codex can now go online while executing tasks to install dependencies, run tests, pull resources, or update packages. Finally.
3️⃣ Internet access is disabled by default. You can enable it when creating or editing an environment, with full control over allowed domains and HTTP methods, so everything is transparent.
4️⃣ This feature is available to Plus, Pro, and Team users. Enterprise support is coming soon.
5️⃣ And a bunch of other useful little things, for example: when Codex is working on a task, it now updates the existing Pull Request instead of creating a new one.

Amazon is taking a step forward in robotics and artificial intelligence by developing software for humanoid robots that will help deliver packages. The company is close to completing an indoor "humanoid park" where the robots will be tested in realistic conditions, working in tandem with Rivian electric vehicles. The goal is versatile robots that can understand and respond to natural-language commands, which could significantly improve Amazon's logistics operations.
