👾 Stop Wasting Hours on Code: Level Up with 2 AI Agent Hacks

Level up your AI agents with the gpt-image-1 API, realistic AI voices, and pro coding tactics like test-driven development and a memory bank.

Agents Made Simple Newsletter

Welcome to Edition #4 of Agents Made Simple

AI agent capabilities are improving from week to week. Several factors influence the performance of an agent system. Examples are the quality of the available tools, the latency (how long it takes the system to return a response), and its accuracy (how well it can retrieve context). This issue introduces solutions for adding high-quality image tools, realistic voices, and enhanced accuracy to your agents.

This week’s topics:

  • Microsoft’s new AI agents

  • Use ChatGPT’s AI image generator via API

  • New super realistic AI voices

  • How AI agents level up software development

  • No-code software prototyping with Lovable 2.0

  • Plus AI investments, community highlights, trending AI tools, and more resources

AI Agent News Roundup

💥 Breakthroughs

Microsoft copilot with researcher and analyst agents

Source: Microsoft

Microsoft released two reasoning agents: Analyst and Researcher. They analyze all of your work data and perform deep research based on OpenAI’s o3 deep research model, enriched with data from web search. Microsoft also announced a platform to easily create, manage, and deploy agents for business needs. These agents can also leverage third-party data via connectors.

Photograph of Sam Altman, excited, in a formal business suit holding an oversized gift box in both of his hands and towards the camera. The box is wrapped in red paper with golden ribbons.

Source: MadeByAgents / ChatGPT

OpenAI launched the gpt-image-1 API, allowing developers to integrate high-quality image generation into their applications directly. This API is based on the same native multimodal model used in ChatGPT, enabling the creation of images across diverse styles, accurate text rendering, and precise instruction following.
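
To give a rough idea of what calling the API involves, here is a minimal sketch that talks to OpenAI's image generations endpoint using only Python's standard library. The helper names (build_image_request, decode_image, generate_image) are made up for this example, and only the basic request fields are shown, so check OpenAI's API reference for the full set of options before relying on it.

```python
import base64
import json
import urllib.request

# Public OpenAI endpoint for image generation.
API_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Build the JSON payload for a gpt-image-1 generation request."""
    return {"model": "gpt-image-1", "prompt": prompt, "size": size}


def decode_image(response_body: dict) -> bytes:
    """Extract the first image from the response; it arrives base64-encoded."""
    return base64.b64decode(response_body["data"][0]["b64_json"])


def generate_image(prompt: str, api_key: str) -> bytes:
    """Send the request and return the raw image bytes (needs network access)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_image_request(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return decode_image(json.load(response))
```

To try it, call generate_image("your prompt", api_key) with your own API key and write the returned bytes to a .png file. An agent could expose generate_image as one of its tools.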

A cinematic photograph capturing a close-up of the advanced robot from "iRobot", its metallic face subtly reflecting the surrounding environment. The robot's head is tilted slightly as if engaged in conversation, its blue LED eyes glowing with intelligent curiosity, and its mouth moving as it utters "Can I help you?". Soft, diffused lighting highlights the robot’s sleek design and the intricate details of its facial features, set against a blurred background of a futuristic office space with glowing screens and metallic surfaces. The 16:9 aspect ratio emphasizes the robot's imposing presence and cinematic grandeur.

Source: MadeByAgents / Ideogram

Open-source text-to-speech models appear to be overtaking leading closed-source models from ElevenLabs and Sesame. New abilities include natural intonation, emotion, and rhythm; voice cloning without prior fine-tuning; control of speech and emotion with simple tags; and low latency of ~200ms for real-time streaming. Examples: Canopy Labs Orpheus TTS and Nari Labs Dia.

📈 Investments¹

🇺🇸 Elon Musk is seeking to raise $25B+ in new capital for his xAI & X venture that could place the company at a total valuation of $200B.

🇺🇸 Nvidia released its NeMo tools, allowing enterprises to integrate AI agents with their data, increasing accuracy. Nvidia describes AI agents as a trillion-dollar opportunity that will boost employee productivity enormously over the coming years.

🇨🇳 Chinese chipmaker Huawei is planning to ship its 910C GPU as early as May. Chinese AI firms want to use it as their main AI chip, reducing dependence on Nvidia.

🇨🇳 BMW wants to integrate DeepSeek AI in its new vehicles in China later this year.

🇨🇳 Chinese President Xi Jinping declared AI self-sufficiency as a top priority amid U.S. rivalry.

My take: The current tariff war has led to high volatility and corrections in the capital markets. With valuations of blue-chip companies crumbling, it’s a gift: a great chance to acquire shares cheaply ahead of the coming AI bull market.

Level Up Development with AI Agents

One of the main applications of AI agents is writing software. One could say we are almost at the point where AI agents are writing themselves. But without constantly providing context about your codebase, you spend more time fixing and correcting the AI than writing the code yourself.

I want to share two new pro tactics for leveraging AI for coding that I learned from Jason Zhou (link to the original post below under Community Highlights).

Building Reliable Code with AI: Test-Driven Development

Creating complex functions with AI coding tools can be hard. Often, the AI doesn't get it right the first time. You might face errors. You feed these errors back to the AI. It tries to fix the code. But sometimes this causes more errors. You become a messenger between the testing results and the AI. This can be frustrating. The AI might fix one part but break another.

A helpful approach is Test-Driven Development (TDD). This is a software process. You write tests for functions or features before writing the code itself. You define what input the function expects. You define the expected output. Then, you write just enough code to pass the test. You improve the code while making sure all tests still pass. The test tells you if it passed or not.
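
Here is a minimal sketch of that order of work in Python, using the built-in unittest module. The slugify function and its test cases are invented for illustration. The point is that the tests are written first and pin down input and expected output, and the function below them is just enough code to pass.

```python
import re
import unittest


# Step 1: write the tests first. They define input and expected output.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Agents Made Simple"), "agents-made-simple")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("Level Up: 2 AI Hacks!"), "level-up-2-ai-hacks")

    def test_empty_input(self):
        self.assertEqual(slugify(""), "")


# Step 2: write just enough code to make the tests pass.
def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Save it as a file and run it with python -m unittest. Any later change to slugify has to keep all three tests green.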

This works well with AI tools like Cursor. It helps you align requirements with the AI. More importantly, Cursor can run command-line tests by itself. You can use features like 'yolo mode' or auto-run. You ask it to write a test. Then write the function. Then run the test. The AI iterates on the function based on the test results until all tests pass.

Example prompt to illustrate the TDD approach
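
The write, run, fix loop described above can be sketched in a few lines of Python. This is a simplified stand-in, not how Cursor works internally: the list of attempts plays the role of the AI's successive versions of a function, and all names (run_tests, iterate_until_green, the leap-year example) are hypothetical.

```python
from typing import Callable

# The tests come first: (input year, expected result).
CASES = [(2024, True), (2023, False), (1900, False), (2000, True)]

# Stand-ins for the AI's successive attempts at the function.
ATTEMPTS: list[Callable[[int], bool]] = [
    lambda year: year % 4 == 0,  # first try: forgets the century rule
    lambda year: (year % 4 == 0 and year % 100 != 0) or year % 400 == 0,
]


def run_tests(candidate: Callable[[int], bool]) -> list[str]:
    """Run every test case and return a message for each failure."""
    failures = []
    for year, expected in CASES:
        actual = candidate(year)
        if actual != expected:
            failures.append(f"is_leap_year({year}) returned {actual}, expected {expected}")
    return failures


def iterate_until_green(attempts: list[Callable[[int], bool]]) -> tuple[Callable[[int], bool], int]:
    """Try each attempt in order and stop at the first one that passes all tests."""
    for round_number, candidate in enumerate(attempts, start=1):
        failures = run_tests(candidate)
        if not failures:
            return candidate, round_number
        # In a real agent loop, these messages would go back into the next prompt.
        print(f"Round {round_number} failed: {failures}")
    raise RuntimeError("No attempt passed all tests")
```

Calling iterate_until_green(ATTEMPTS) rejects the first version because of the 1900 case and accepts the second. You no longer act as the messenger, because the test output is the feedback.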

Giving AI a Memory: The Memory Bank Concept

When using AI coding agents for large or complex projects, they often lack context. They don't remember what has been done. They forget the implementation choices made earlier. They might not understand the current task fully. When you ask the agent to add new features, it might not have enough context. This can lead to it messing up existing files.

A concept called Memory Bank helps solve this. It was first introduced by an open-source tool called Cline. The idea is that the AI agent can have memory. It remembers past tasks. It remembers decisions made. It knows the latest task it is working on.

How does this work? It's not magic. It involves defining specific files. These files store different kinds of project context.

  • A project brief defines core goals and requirements.

  • Product context covers user experience and how things should work.

  • Active context holds the current focus, recent changes, decisions, and patterns.

  • System patterns store architecture and key technical decisions.

  • Tech context includes the tech stack, constraints, and dependencies.

  • A progress file (progress.md) tracks what works and what is left to build.
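
To make the file list concrete, here is a small Python sketch that scaffolds such a memory bank in a project folder. The folder name memory-bank and the file names are assumptions modeled on the list above, so adjust them to whatever your tool expects.

```python
from pathlib import Path

# Assumed file names, one per kind of context from the list above.
MEMORY_BANK_FILES = {
    "projectbrief.md": "Core goals and requirements.",
    "productContext.md": "User experience and how things should work.",
    "activeContext.md": "Current focus, recent changes, decisions, and patterns.",
    "systemPatterns.md": "Architecture and key technical decisions.",
    "techContext.md": "Tech stack, constraints, and dependencies.",
    "progress.md": "What works and what is left.",
}


def scaffold_memory_bank(project_root: Path) -> list[Path]:
    """Create any missing memory bank files and return the ones created."""
    bank_dir = project_root / "memory-bank"
    bank_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name, purpose in MEMORY_BANK_FILES.items():
        path = bank_dir / name
        if not path.exists():  # never overwrite notes that already exist
            path.write_text(f"# {name}\n\n{purpose}\n", encoding="utf-8")
            created.append(path)
    return created
```

Running scaffold_memory_bank(Path(".")) once gives you six placeholder files that you or the agent can then fill in.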

How the memory bank works

These files act as a central context repository. The AI coding agent is taught how and when to use these files. Before building new features, it first reads from this memory bank. It gets a quick download of the project. It understands what has been done. It learns from implementation decisions made previously. This provides enough context. The AI can then build new features on top of what exists without messing up the project.
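
The "read first, then build" step could look roughly like this in Python. It is a sketch under assumptions: the file names, the reading order, and the prompt layout are all hypothetical, and in practice a tool like Cline or Cursor handles this for you through its instructions.

```python
from pathlib import Path

# Assumed reading order: stable background first, current state last.
READ_ORDER = [
    "projectbrief.md",
    "productContext.md",
    "systemPatterns.md",
    "techContext.md",
    "activeContext.md",
    "progress.md",
]


def load_memory_bank(bank_dir: Path) -> str:
    """Concatenate the memory bank files that exist, in a fixed order."""
    sections = []
    for name in READ_ORDER:
        path = bank_dir / name
        if path.exists():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


def build_prompt(task: str, bank_dir: Path) -> str:
    """Put the project context in front of the new task."""
    context = load_memory_bank(bank_dir)
    return f"Project context:\n{context}\n\nTask:\n{task}"
```

The agent sees the project brief and earlier decisions before it sees the task, which is what keeps it from reinventing or overwriting existing work.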

This method works for new projects started from scratch. It also works for existing projects. The AI agent can look at your existing project files. It can understand what is going on. It can then generate these memory bank files to gain context.

Cursor’s New Features to Level Up Development

In Cursor, this memory bank concept can be adopted using the custom modes feature. Cursor quietly rolled out custom modes, which are currently in beta. You can enable them in Cursor Settings → Features → Chat → Custom modes.

You can paste in specific instructions that teach the AI to use and update memory bank files. There's also a more advanced project called Cursor Memory Bank. This project pushes the idea further. It uses different custom modes for structured workflows. These modes can include:

  • Van mode for initial setup and understanding the existing project.

  • Plan mode for breaking down tasks, like an engineering manager.

  • Creative mode for thinking, exploration, and debugging. It can help figure out potential root causes for bugs.

  • Build mode for actually implementing the functionality.

Concept of the memory bank system in Cursor with custom modes

Using these modes and the associated memory bank files helps the AI load relevant rules dynamically. It keeps track of progress and context across chats and tasks. This provides a much better starting point for the AI when you want it to implement new features in an existing project.
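
One way to picture "loading relevant context dynamically" is a simple lookup from mode to files. The Python sketch below is purely illustrative: the four modes come from the list above, but the file names and which files each mode reads are invented here and are not taken from the Cursor Memory Bank project.

```python
from enum import Enum


class Mode(Enum):
    VAN = "van"            # initial setup, understand the existing project
    PLAN = "plan"          # break the work down into tasks
    CREATIVE = "creative"  # explore options, debug, find root causes
    BUILD = "build"        # implement the functionality


# Hypothetical mapping: each mode loads only the context it needs.
MODE_FILES: dict[Mode, list[str]] = {
    Mode.VAN: [
        "projectbrief.md", "productContext.md", "systemPatterns.md",
        "techContext.md", "activeContext.md", "progress.md",
    ],
    Mode.PLAN: ["projectbrief.md", "activeContext.md", "progress.md"],
    Mode.CREATIVE: ["systemPatterns.md", "techContext.md", "activeContext.md"],
    Mode.BUILD: ["activeContext.md", "systemPatterns.md", "progress.md"],
}


def context_files(mode: Mode) -> list[str]:
    """Return the memory bank files to load for the given mode."""
    return list(MODE_FILES[mode])
```

Keeping the per-mode context small means less noise in each chat, while the shared files such as the active context carry state from one mode to the next.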

Tool Spotlight

Lovable chat UI

Source: Lovable

Lovable now lets you build apps faster with a smarter Chat Mode Agent, collaborate in real time with team workspaces, and scan your projects for security issues.

The UI got a fresh new look. Plus features like Dev Mode, visual editing, and custom domains make building production-ready apps even smoother.

To give you an example, I prompted it to build a Pomodoro-style productivity app with AI summaries:

CONTEXT

You are building “Focus Booster,” a lightweight Pomodoro-style productivity app for teams. It lets users start timed focus sessions, log what they worked on, and get an AI-generated “Daily Wins” summary at the end of the day.

GOAL

Deliver a production-ready web app where users can:
1. Start/stop 25 min focus timers.
2. Record brief notes per session.
3. View a history of sessions.
4. Receive an end-of-day AI summary of accomplishments and insights.

FEATURES, TECH STACK, WORKFLOW, SECURITY

Part of the detailed input prompt. (If you are unsure about the tech stack and other details of your app, simply ask ChatGPT to help you come up with the missing parts.)

Hit send and watch it code the app for you.

Prototype of the focus booster app

You can continue prompting it to add more features or fix bugs, and test it right away in the preview.

Non-technical people usually hit a roadblock with these tools sooner or later in the process, especially when it’s a more complex app. In my example, I still had to connect the APIs manually for Supabase, OpenAI, SendGrid, and Slack to make it work.

Community Highlights

Made By Agents Updates

🏞️ Krea: AI-powered creative platform that enables users to generate and edit images using text prompts and real-time feedback.

🔀 OpenRouter: Unified API gateway that allows developers to access and switch between LLMs from providers like OpenAI, Anthropic, and Mistral through a single interface.

💚 Lovable: AI-driven software engineering platform that empowers non-coders to build complete tech products using natural language prompts.

More Resources

Blog: AI-driven business automation and practical strategies for growth
AI Tool Collection: Discover and compare the perfect AI solutions
Consultancy: I help you solve your problem or discover AI potential
Follow along on YouTube, X, LinkedIn, and Instagram

See you next time!

Tobias - Founder of MadeByAgents

P.S. Was this useful? Have ideas on what I should publish next? Tap the poll or reply to this email. I read every response.

¹ Disclaimer: The information shared reflects my personal opinions and is for informational purposes only. It is not financial advice, and you should consult a qualified professional before making any decisions.
