Core Concepts

YosrAI is built on three pillars: Agent, Context, and Workflow.

1. Agent (`Agent`)

An Agent is an autonomous unit of execution. It has:

Identity: Defined by instructions (System Prompt).
Hands: A list of tools it can execute. Custom functions or standard kits from yosrai.toolkit.
Voice: An LLMClient (OpenAI, Ollama, Anthropic, Google) which supports Streaming.
Memory: A long-term storage (Memory) to recall past interactions.

It runs a ReAct Loop: Thought -> Action -> Observation -> Thought.

2. Context (`Context`)

The Context is the "Memory" or "Blackboard" of your application.

It stores Input (from user).
It stores State (variables shared between agents).
It is Serializable (can be saved to JSON and resumed).

3. Workflow (`Workflow`)

Workflows allow you to chain Agents together.

Sequential: A -> B -> C
Branching: A -> (if x) B else C
Looping: A -> (while x) B

flow = Workflow("StoryPipeline").start(writer).then(editor)

4. Conductor (`Conductor`)

The Conductor is a specialized Agent designed to orchestrate other Agents (called Skills). It acts as a manager that delegates tasks to experts.

Skills: Sub-agents or functions that the Conductor can call.
Planning: Can automatically generate an execution plan before starting.
Shared Memory: Can inject a shared memory instance into all skills for cross-agent knowledge.
Resilience: Skills can be configured with retries and timeouts.

from yosrai import Conductor, Agent, SkillConfig

# Create skills
researcher = Agent(name="Researcher", ...)
writer = Agent(name="Writer", ...)

# Configure skill with resilience
robust_researcher = SkillConfig(
    skill=researcher,
    retries=3,
    timeout=60.0
)

# Create Conductor
boss = Conductor(
    name="Boss",
    instructions="Manage the team.",
    skills=[robust_researcher, writer],
    planning=True,  # Enable planning mode
    shared_memory=shared_mem  # Share memory across all skills
)

boss.run("Research AI and write a post.")

5. Pipeline (`Pipeline`)

A Pipeline executes a sequence of skills linearly, where the output of one step becomes the input of the next.

from yosrai import Pipeline

# Define a linear process
pipeline = Pipeline(
    name="ContentFlow",
    steps=[researcher, summarizer, translator]
)

# Run it directly
result = pipeline.run("Topic: AI")

# Or use it as a skill in a Conductor
conductor = Conductor(..., skills=[pipeline])

6. Human-in-the-Loop (`HumanAgent`)

Sometimes AI needs supervision. The HumanAgent is a special agent that:

Pauses the workflow.
Displays the current context/output to the human.
Waits for human input (Approval, Feedback, or Override).
Resumes the workflow with the human's input injected into the context.

This turns the user into a participant in the agentic workflow.

7. Structured Output

Agents usually return text, but software needs objects. You can force an Agent to return a Pydantic model.

class Analysis(BaseModel):
    sentiment: str
    score: float

agent = Agent(..., response_model=Analysis)
result = agent.run("I love this!")
# result is an Analysis object, not a string.

This works with OpenAI (native beta.parse) and Ollama (JSON mode + Schema injection).

8. Toolkits

While Agents can use custom functions as tools, YosrAI provides Toolkits for common capabilities.

WebToolkit: Search the internet (DuckDuckGo).
FileToolkit: Read/Write files within a sandboxed directory.
SystemToolkit: Execute shell commands (use with caution).

from yosrai.toolkit import WebToolkit

# Equip agent with web search capability
agent = Agent(..., tools=WebToolkit().get_tools())

9. Async Support

For high-throughput applications, YosrAI is fully async-native. Both Agents and Workflows expose async methods (arun).

# Async Agent
response = await agent.arun("Hello")

# Async Workflow
result = await workflow.arun(Context(input="Start"))

10. Strict State (Typed Context)

For robust applications, you can enforce a schema on your Context using Pydantic.

from pydantic import BaseModel
from yosrai.engine import Context

class AppState(BaseModel):
    user_id: str
    score: int = 0

# Initialize with schema
ctx = Context(schema=AppState, user_id="u123")

# Valid assignment
ctx.score = 10

# Invalid assignment raises ValueError
# ctx.score = "ten"  # Error!

YosrAI agents can perceive more than just text. The Message and ContentBlock primitives allow you to send images to Vision-capable models.

from yosrai.engine import Message

# Send text and image together
msg = Message.user(
    "What is in this image?",
    images=["https://example.com/image.png"]
)

response = agent.run([msg])

Universal Support: Works with OpenAI, Anthropic, and Google.
Automatic Handling: URLs are automatically downloaded and converted to Base64 for providers that don't support URL inputs natively (like Anthropic).
Flexible Input: Supports HTTP URLs, Base64 strings, and local file paths.

12. Observability

YosrAI provides a structured event system to monitor your workflows. You can attach an EventManager to capture detailed traces.

from yosrai.engine import Workflow, EventType, EventManager

def on_start(event):
    print(f"Started run {event.run_id} at {event.timestamp}")

em = EventManager()
em.on(EventType.WORKFLOW_START, on_start)

flow = Workflow("MyFlow", event_manager=em)

13. Service Wrapper

Deploying your Agent or Workflow as a REST API is simple with the ServiceWrapper. It exposes standard endpoints (/run, /health, /info) using FastAPI.

from yosrai.engine import ServiceWrapper

service = ServiceWrapper(agent, title="My Service")
service.run(host="0.0.0.0", port=8000)

You can now access Swagger UI at http://localhost:8000/docs and execute your agent via POST /run.

14. Multi-Provider Support

YosrAI supports major LLM providers natively via an explicit Registry.

OpenAI: openai/gpt-4o
Anthropic: anthropic/claude-3-5-sonnet (requires yosrai[anthropic])
Google: google/gemini-1.5-pro (requires yosrai[google])
Ollama: ollama/llama3 (Local)

The provider is selected automatically based on the model string prefix.

15. Semantic Memory (RAG)

Give your agents a "Hippocampus" to remember facts and documents using Embeddings and Vector Search.

Embedders: OpenAIEmbedder, OllamaEmbedder.
Memory: LocalVectorMemory for lightweight, file-backed semantic search without heavy vector DBs.

from yosrai.engine.memory import LocalVectorMemory
from yosrai.engine.embeddings import OpenAIEmbedder

mem = LocalVectorMemory("brain.json", OpenAIEmbedder())
agent = Agent(..., memory=mem)

16. Blueprint System

YosrAI agents and workflows can be serialized to JSON blueprints for external tools and hosting.

Agent Blueprints

from yosrai import Agent

# Create and export blueprint
agent = Agent(name="Researcher", instructions="Research topics", tools=[search_tool])
blueprint = agent.to_blueprint()  # Returns JSON-compatible dict

# Reconstruct from blueprint
reconstructed = Agent.from_blueprint(blueprint)

Workflow Blueprints

from yosrai import Workflow, Agent

# Create workflow
flow = Workflow("Pipeline").start(agent1).then(agent2)
blueprint = flow.to_blueprint()

# Reconstruct complete workflow
reconstructed_flow = Workflow.from_blueprint(blueprint, agents={"agent1": agent1, "agent2": agent2})

Use Cases: Visual builders, version control, deployment, testing.

17. Export Formats

The engine can express itself in multiple formats for learning and integration.

Code Generation

Generate idiomatic YosrAI Python code:

# Export agent as code
code = agent.to_code()
print(code)  # Complete Python code to recreate the agent

# Export workflow as code
workflow_code = workflow.to_code()

Visual Diagrams

Generate Mermaid flowchart diagrams:

# Generate visual flowchart
diagram = workflow.to_mermaid()
print(diagram)  # Mermaid markdown that can be rendered

Blueprint Execution

Run agents/workflows directly from JSON via API:

from yosrai import ServiceWrapper

# Service can execute blueprints directly
service = ServiceWrapper(Agent(name="Placeholder", instructions="..."))
# POST /run-blueprint with blueprint JSON

18. Transparent Execution

Watch the engine think with verbose mode and execution replay.

Verbose Mode

Real-time step-by-step output during execution:

agent = Agent(..., verbose=True)
agent.run("Hello")
# Output: 🚀 Agent started
#         💭 Thinking... done
#         ✅ Agent → Hello! How can I help?

Execution Replay

Capture and review execution with timing:

# Capture execution
result = agent.run("Query", trace=True)

# Replay with timing
result.replay()

# Access execution data
print(f"Duration: {result.duration_ms}ms")
print(f"Events: {len(result.events)}")
trace = result.to_dict()  # Export as JSON

Educational Errors

Helpful error messages with suggestions:

try:
    agent.run("...")
except YosraiError as e:
    print(e)  # Shows message + suggestion + doc link
    e.rich_print()  # Fancy formatted output

19. Hosting & Service Deployment

Deploy agents and workflows as REST APIs with automatic OpenAPI documentation.

Basic Service

from yosrai import ServiceWrapper

service = ServiceWrapper(agent, title="My Agent API")
service.run(port=8000)

API Endpoints

GET /health - Health check
GET /info - Service metadata + blueprint export
POST /run - Execute agent/workflow
POST /run-blueprint - Execute from JSON blueprint
GET /docs - Swagger UI (auto-generated)
GET /openapi.json - OpenAPI specification

Request/Response Format

// POST /run
{
  "input": "Hello world",
  "inputs": {"custom": "data"}
}

// Response
{
  "output": "Hello! How can I help you today?"
}

20. Thinking Mode

Enable chain-of-thought reasoning to improve the accuracy and depth of agent responses.

agent = Agent(..., thinking=True)
result = agent.run("Solve this complex puzzle")

When enabled, the agent is instructed to think step-by-step and wrap its internal reasoning in <thinking> tags before providing the final answer.

21. Self-Reflection Pattern

Enable a built-in self-critique loop to refine and improve outputs automatically.

agent = Agent(..., reflection=True)
result = agent.run("Write a secure login function in Python")

When enabled, the agent will: 1. Produce an initial response. 2. Critique its own response for errors or improvements. 3. Automatically refine the answer based on the critique.

22. Text Chunking (RAG Infrastructure)

Prepare large documents for semantic search by splitting them into manageable, context-aware pieces.

from yosrai.engine.utils import chunk_text, chunk_content

# Basic string chunking
chunks = chunk_text(large_text, size=1000, overlap=100, strategy="sentence")

# Metadata-aware chunking for RAG
rich_chunks = chunk_content(
    large_text, 
    metadata={"source": "manual.pdf"},
    strategy="paragraph"
)

Chunking Strategies:

basic: Split by character count.
sentence: Split at sentence boundaries (., !, ?).
paragraph: Split at paragraph boundaries (\n\n).

Metadata Preservation:

chunk_content automatically attaches the original metadata, plus chunk_index and total_chunks, to every chunk, ensuring perfect attribution in retrieval results.

23. Cost Tracking

Monitor LLM usage and costs across all providers with built-in pricing tables and real-time tracking.

Usage Tracking

from yosrai import Agent, Usage

agent = Agent(name="Bot", instructions="Helpful assistant")

# Automatic usage tracking
result = agent.run("Hello world")
usage = agent.last_usage

print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")
print(f"Total tokens: {usage.total_tokens}")
print(f"Cost: ${usage.cost:.4f}")

Cost Calculation

from yosrai import calculate_cost, register_model_pricing

# Built-in pricing for major providers
cost = calculate_cost("openai/gpt-4o", prompt_tokens=100, completion_tokens=50)
print(f"Cost: ${cost:.4f}")

# Add custom pricing for private models
register_model_pricing("my-private-model", input_price=0.001, output_price=0.002)

Workflow Cost Aggregation

from yosrai import Workflow, Agent

agent1 = Agent.from_preset("researcher")
agent2 = Agent.from_preset("writer")

workflow = Workflow("Pipeline").start(agent1).then(agent2)
result = workflow.run("Create an article about AI")

print(f"Total workflow cost: ${workflow.last_usage.cost:.4f}")

24. Agent Presets

Pre-configured agents for common patterns. Get started instantly without manual configuration.

Available Presets

from yosrai import Agent

# Specialized agents for common tasks
researcher = Agent.from_preset("researcher")    # Deep investigation
writer = Agent.from_preset("writer")           # Creative content
critic = Agent.from_preset("critic")           # Analysis & feedback
coder = Agent.from_preset("coder")             # Code generation
planner = Agent.from_preset("planner")         # Task planning
summarizer = Agent.from_preset("summarizer")   # Concise summaries
assistant = Agent.from_preset("assistant")     # General help
translator = Agent.from_preset("translator")   # Language translation

Custom Presets

from yosrai import Agent, register_agent_preset

# Create custom preset
custom_config = {
    "instructions": "You are a financial analyst specializing in crypto markets.",
    "model": "openai/gpt-4o",
    "temperature": 0.1,
    "thinking": True,
    "reflection": True
}

register_agent_preset("crypto_analyst", custom_config)
analyst = Agent.from_preset("crypto_analyst")

Preset Discovery

from yosrai import list_agent_presets

# List all available presets
presets = list_agent_presets()
for preset in presets:
    print(f"{preset['name']}: {preset['description']}")

25. CLI Tool

Command-line interface transforming YosrAI from library to tool. Enable scripting, CI/CD, and rapid prototyping.

Blueprint Validation

# Validate single blueprint
yosrai validate agent.json

# Validate entire directory (CI/CD ready)
yosrai validate blueprints/ --recursive --json

# Exit codes for automation
# 0 = valid, 1 = invalid, 2 = error

Blueprint Execution

# Run agent from blueprint
yosrai run agent_blueprint.json --input "Hello world"

# Run workflow with custom inputs
yosrai run workflow.json --inputs-file data.json --json

# Multiple input formats supported
yosrai run agent.json --inputs-json '{"query": "test"}'

Diagram Generation

# Generate Mermaid diagrams
yosrai diagram workflow.json
yosrai diagram workflow.json -o docs/diagram.md

Interactive Chat

# Chat with preset agents
yosrai chat --preset researcher
yosrai chat --model openai/gpt-4o

# Persistent conversations
yosrai chat --save-session chat.json
yosrai chat --load-session chat.json

# Commands within chat:
/status  # Show conversation statistics
/clear   # Reset conversation
/save    # Save session
/help    # Show commands

Scaffolding

# Create new agents and workflows
yosrai new agent MyAgent --preset researcher
yosrai new workflow ContentPipeline
yosrai new list-presets

26. Conversation Mode

Natural multi-turn conversations with automatic history management and cost tracking.

Basic Conversations

from yosrai import Agent

agent = Agent.from_preset("assistant")

# Context manager pattern
with agent.conversation() as chat:
    response1 = chat.send_message("Hello! My name is Alice.")
    response2 = chat.send_message("What's my name?")
    response3 = chat.send_message("Tell me a joke.")

    print(f"Conversation cost: ${chat.total_usage.cost:.4f}")

Conversation Persistence

from yosrai import Conversation

# Save conversation to file
with agent.conversation(session_id="tutorial") as chat:
    chat.send_message("Explain recursion")
    chat.send_message("Give a Python example")

    chat.save_to_file("recursion_tutorial.json")

# Resume conversation later
loaded = Conversation.load_from_file("recursion_tutorial.json", agent)
with loaded:
    response = loaded.send_message("Now explain it with a different example")

Advanced Conversation Features

# Custom system prompt override
with agent.conversation(system_prompt="You are a pirate. Speak like one!") as chat:
    response = chat.send_message("How are you?")
    print(response)  # "Arrr, I'm doin' fine, matey!"

# Initial context variables
with agent.conversation(user_name="Dr. Smith", topic="physics") as chat:
    chat.send_message("Hello!")
    # Agent knows user's name and preferred topic

# History limits for long conversations
with agent.conversation(max_history=20) as chat:
    # History automatically trimmed to prevent token limit issues
    pass

Conversation Analysis

with agent.conversation() as chat:
    # ... conversation ...
    summary = chat.get_summary()

    print(f"Session: {summary['session_id']}")
    print(f"Messages: {summary['total_messages']}")
    print(f"Total cost: ${summary['total_cost']:.4f}")

CLI Integration

# Interactive conversation via CLI
yosrai chat --preset assistant --save-session conversation.json

# Load and continue previous conversation
yosrai chat --load-session conversation.json

Core Concepts

1. Agent (Agent)

2. Context (Context)

3. Workflow (Workflow)

4. Conductor (Conductor)

5. Pipeline (Pipeline)

6. Human-in-the-Loop (HumanAgent)

7. Structured Output

8. Toolkits

9. Async Support

10. Strict State (Typed Context)

11. Multi-Modal Interaction

12. Observability

13. Service Wrapper

14. Multi-Provider Support

15. Semantic Memory (RAG)

16. Blueprint System

Agent Blueprints

Workflow Blueprints

17. Export Formats

Code Generation

Visual Diagrams

Blueprint Execution

18. Transparent Execution

Verbose Mode

Execution Replay

Educational Errors

19. Hosting & Service Deployment

Basic Service

API Endpoints

Request/Response Format

20. Thinking Mode

21. Self-Reflection Pattern

22. Text Chunking (RAG Infrastructure)

Chunking Strategies:

Metadata Preservation:

23. Cost Tracking

Usage Tracking

Cost Calculation

Workflow Cost Aggregation

24. Agent Presets

Available Presets

Custom Presets

Preset Discovery

25. CLI Tool

Blueprint Validation

Blueprint Execution

Diagram Generation

Interactive Chat

Scaffolding

26. Conversation Mode

Basic Conversations

Conversation Persistence

Advanced Conversation Features

Conversation Analysis

CLI Integration

1. Agent (`Agent`)

2. Context (`Context`)

3. Workflow (`Workflow`)

4. Conductor (`Conductor`)

5. Pipeline (`Pipeline`)

6. Human-in-the-Loop (`HumanAgent`)