Automating Lead Generation for Local Businesses with AI Agents

Tech Stack

Python
LangChain
OpenAI
React
Web Scraping
BeautifulSoup
Selenium
MongoDB

Built an AI-powered lead generation system using LangChain agents that scrapes business directories, enriches the data with ML lead scoring, and generates personalized emails with GPT-4. It processed 62,000+ prospects at an 8.2% response rate (4x industry average) and delivered 300% user growth for FitCheck, $2k+ monthly revenue for Workwear, and 2x ROI for Gloss Authority across 15+ clients.


Local businesses waste 15-20 hours per week manually prospecting—scraping directories, researching companies, crafting cold emails—only to get 2-3% response rates. For Lume (District Four), a digital marketing agency serving 15+ local businesses, this manual process couldn't scale.

I built an AI agent system using LangChain that automates the entire lead generation pipeline: scraping business directories, enriching data with web research, scoring leads with ML, and generating hyper-personalized outreach emails using GPT-4. The system processes 5,000+ prospects monthly and achieved a 3x increase in client acquisition rate.

Here's how I architected an agentic AI system that turned cold outreach from a time sink into a revenue driver—delivering 300% user growth for FitCheck, $2k+ monthly revenue for Workwear, and 2x ROI for Gloss Authority.

The Problem: Manual Lead Gen Doesn't Scale

Traditional Lead Generation is Broken

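Local businesses (restaurants, gyms, salons, boutiques) need consistent customer acquisition, but manual prospecting consumes 15-20 hours per week and typically returns only 2-3% response rates.

For Lume's 15+ clients, that manual process could not keep up.

The opportunity: automate with AI agents to scale to thousands of prospects while maintaining personalization.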

Architecture

┌─────────────────────────────────────────────────────────────┐
│            AI Agent Orchestration (LangChain)                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │  Scraper     │  │  Enrichment  │  │  Outreach    │      │
│  │  Agent       │→ │  Agent       │→ │  Agent       │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└────────────────────────┬────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────────────┐
│                    Data Pipeline                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │  Business    │  │  Website     │  │  Social      │      │
│  │  Directories │  │  Scraper     │  │  Media API   │      │
│  │ (Yelp, GMaps)│  │              │  │ (Instagram)  │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└────────────────────────┬────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────────────┐
│                Lead Scoring & Enrichment                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │  ML Scoring  │  │  GPT-4       │  │  Email       │      │
│  │  Model       │  │  Summary     │  │  Validation  │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└────────────────────────┬────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────────────┐
│                  Email Outreach Engine                       │
│  - GPT-4 personalized email generation                      │
│  - A/B testing (5 variants per campaign)                    │
│  - Follow-up sequence (3 emails, 7-day cadence)            │
│  - SendGrid API integration                                  │
└────────────────────────┬────────────────────────────────────┘
                         ↓
                MongoDB (Leads Database)
                + React Dashboard (Client Portal)
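
The outreach engine in the diagram lists A/B testing with five variants per campaign, but the post does not show how leads are assigned to variants. The following is only a minimal sketch of one common approach: hashing a stable lead identifier so that a lead always lands in the same variant. The function name `assign_variant`, the `campaign_id` parameter, and the choice of the email address as identifier are assumptions made for illustration.

```python
import hashlib

NUM_VARIANTS = 5  # from the architecture diagram: 5 variants per campaign


def assign_variant(lead_email: str, campaign_id: str, num_variants: int = NUM_VARIANTS) -> int:
    """Deterministically map a lead to a variant index in [0, num_variants).

    Hypothetical helper: hashing (campaign, email) keeps the assignment stable
    across runs, so follow-ups can reuse the variant of the first email.
    """
    key = f"{campaign_id}:{lead_email.strip().lower()}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % num_variants


if __name__ == "__main__":
    for email in ["owner@example.com", "info@example.org"]:
        print(email, "-> variant", assign_variant(email, "spring-gyms"))
```

Because the assignment depends only on the campaign and the normalized email, no extra state has to be stored to keep a lead in the same test arm.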

Implementation

1. LangChain Agent Orchestration

Built a multi-agent system where specialized agents handle different tasks:

# agents/lead_generation_agent.py
from datetime import datetime
from typing import Dict, List

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI
from langchain.tools import Tool
from langchain.prompts import ChatPromptTemplate
 
class LeadGenerationAgent:
    """
    Multi-agent system for automated lead generation
    
    Agents:
    1. Scraper Agent - Find prospects from directories
    2. Enrichment Agent - Research and score leads
    3. Outreach Agent - Generate personalized emails
    """
    
    def __init__(self):
        self.llm = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0.7)
        
        # Initialize sub-agents
        self.scraper_agent = self._create_scraper_agent()
        self.enrichment_agent = self._create_enrichment_agent()
        self.outreach_agent = self._create_outreach_agent()
    
    def _create_scraper_agent(self) -> AgentExecutor:
        """
        Agent that scrapes business directories
        
        Tools:
        - search_yelp: Find businesses on Yelp
        - search_google_maps: Find businesses on Google Maps
        - extract_contacts: Parse contact details from websites
        """
        tools = [
            Tool(
                name="search_yelp",
                func=self.search_yelp,
                description="Search Yelp for businesses in a specific category and location"
            ),
            Tool(
                name="search_google_maps",
                func=self.search_google_maps,
                description="Search Google Maps for businesses"
            ),
            Tool(
                name="extract_contacts",
                func=self.extract_contact_info,
                description="Extract email and phone from business website"
            )
        ]
        
        prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a business research agent. Your job is to:
            1. Search directories for businesses matching the target criteria
            2. Extract complete contact information
            3. Validate that businesses are operational
            4. Return a structured list of prospects"""),
            ("human", "{input}"),
            ("placeholder", "{agent_scratchpad}"),
        ])
        
        agent = create_openai_functions_agent(self.llm, tools, prompt)
        return AgentExecutor(agent=agent, tools=tools, verbose=True)
    
    def _create_enrichment_agent(self) -> AgentExecutor:
        """
        Agent that enriches lead data with research
        
        Tools:
        - scrape_website: Extract key info from business website
        - check_social_media: Get social media presence
        - analyze_reviews: Summarize customer sentiment
        - score_lead: Calculate lead quality score
        """
        tools = [
            Tool(
                name="scrape_website",
                func=self.scrape_website,
                description="Scrape and summarize a business website"
            ),
            Tool(
                name="check_social_media",
                func=self.check_social_media,
                description="Check Instagram, Facebook presence and follower count"
            ),
            Tool(
                name="analyze_reviews",
                func=self.analyze_reviews,
                description="Analyze Google/Yelp reviews for pain points"
            ),
            Tool(
                name="score_lead",
                func=self.score_lead,
                description="Score lead quality (0-100)"
            )
        ]
        
        prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a lead enrichment agent. Your job is to:
            1. Research each prospect thoroughly
            2. Identify their pain points and opportunities
            3. Score lead quality based on criteria
            4. Provide actionable insights for outreach"""),
            ("human", "{input}"),
            ("placeholder", "{agent_scratchpad}"),
        ])
        
        agent = create_openai_functions_agent(self.llm, tools, prompt)
        return AgentExecutor(agent=agent, tools=tools, verbose=True)
    
    def _create_outreach_agent(self) -> AgentExecutor:
        """
        Agent that generates personalized outreach
        
        Tools:
        - generate_email: Create personalized email
        - generate_subject: Create compelling subject line
        - schedule_followup: Create follow-up sequence
        """
        tools = [
            Tool(
                name="generate_email",
                func=self.generate_personalized_email,
                description="Generate personalized cold email based on research"
            ),
            Tool(
                name="generate_subject",
                func=self.generate_subject_line,
                description="Generate attention-grabbing subject line"
            ),
            Tool(
                name="schedule_followup",
                func=self.schedule_followup_sequence,
                description="Create 3-email follow-up sequence"
            )
        ]
        
        prompt = ChatPromptTemplate.from_messages([
            ("system", """You are an expert copywriter. Your job is to:
            1. Write hyper-personalized cold emails that convert
            2. Reference specific details about the prospect
            3. Highlight relevant case studies and results
            4. Create compelling subject lines
            5. Follow proven cold email frameworks (AIDA, PAS)"""),
            ("human", "{input}"),
            ("placeholder", "{agent_scratchpad}"),
        ])
        
        agent = create_openai_functions_agent(self.llm, tools, prompt)
        return AgentExecutor(agent=agent, tools=tools, verbose=True)
    
    async def generate_leads(
        self,
        business_type: str,
        location: str,
        count: int = 100
    ) -> List[Dict]:
        """
        Main pipeline: Scrape → Enrich → Generate outreach
        
        Args:
            business_type: "restaurant", "gym", "salon", etc.
            location: "New York, NY"
            count: Number of leads to generate
        
        Returns:
            List of enriched leads with outreach emails
        """
        import json  # local import: the agents are asked to answer in JSON

        # Step 1: Scrape prospects
        print(f"🔍 Scraping {count} {business_type} businesses in {location}...")
        scrape_result = await self.scraper_agent.ainvoke({
            "input": f"Find {count} {business_type} businesses in {location}. "
                    f"Extract name, address, phone, email, website. "
                    f"Return only a JSON list of objects with those keys."
        })
        # AgentExecutor returns a dict; the final answer is the string under "output".
        # Parsing assumes the model followed the JSON instruction.
        prospects = json.loads(scrape_result["output"])
        
        # Step 2: Enrich each prospect
        print(f"📊 Enriching {len(prospects)} prospects...")
        enriched_leads = []
        
        for prospect in prospects:
            enrich_result = await self.enrichment_agent.ainvoke({
                "input": f"Research {prospect['name']} ({prospect.get('website')}). "
                        f"Analyze their website, social media, and reviews. "
                        f"Identify pain points and score lead quality. "
                        f"Return only a JSON object with keys "
                        f"score, pain_points, website_summary."
            })
            enrichment = json.loads(enrich_result["output"])
            
            enriched_leads.append({
                **prospect,
                **enrichment,
                'enriched_at': datetime.utcnow()
            })
        
        # Step 3: Generate outreach for high-quality leads
        print("✉️  Generating outreach emails...")
        qualified_leads = [lead for lead in enriched_leads if lead.get('score', 0) >= 70]
        
        for lead in qualified_leads:
            outreach_result = await self.outreach_agent.ainvoke({
                "input": f"Create personalized cold email for {lead['name']}. "
                        f"Pain points: {lead.get('pain_points')}. "
                        f"Their website: {lead.get('website_summary')}. "
                        f"Our case study: FitCheck achieved 300% user growth. "
                        f"Return only a JSON object with keys email, subject, followups."
            })
            outreach = json.loads(outreach_result["output"])
            
            lead['email_content'] = outreach['email']
            lead['subject_line'] = outreach['subject']
            lead['followup_sequence'] = outreach['followups']
        
        return qualified_leads
 
    # Tool implementations (methods of LeadGenerationAgent; the other tools referenced above are omitted)
    def search_yelp(self, business_type: str, location: str) -> List[Dict]:
        """Scrape Yelp for businesses"""
        # Implementation with Yelp API or web scraping
        pass
     
    def scrape_website(self, url: str) -> Dict:
        """Scrape and summarize business website"""
        # Implementation with BeautifulSoup + GPT-4
        pass
     
    def generate_personalized_email(self, lead_data: Dict) -> str:
        """Generate personalized cold email with GPT-4"""
        # Implementation below
        pass
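
The pipeline above awaits each enrichment call one at a time, which gets slow at thousands of prospects. Below is a hedged sketch of how that loop could run with bounded concurrency using `asyncio.Semaphore` and `asyncio.gather`. The names `enrich_all`, `max_concurrency`, and the stand-in coroutine `fake_enrich` are hypothetical and not part of the original system; in practice `enrich_one` would wrap the enrichment agent call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def enrich_all(
    prospects: List[Dict],
    enrich_one: Callable[[Dict], Awaitable[Dict]],
    max_concurrency: int = 5,
) -> List[Dict]:
    """Enrich prospects concurrently, never running more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prospect: Dict) -> Dict:
        # The semaphore caps the number of in-flight enrichment calls
        async with semaphore:
            enrichment = await enrich_one(prospect)
        return {**prospect, **enrichment}

    # gather returns results in the same order as the input prospects
    return await asyncio.gather(*(guarded(p) for p in prospects))


async def fake_enrich(prospect: Dict) -> Dict:
    """Stand-in for an enrichment agent call (assumption for this example)."""
    await asyncio.sleep(0.01)
    return {"score": 80 if prospect.get("website") else 40}


if __name__ == "__main__":
    sample = [
        {"name": "A", "website": "https://a.example"},
        {"name": "B", "website": None},
    ]
    print(asyncio.run(enrich_all(sample, fake_enrich, max_concurrency=2)))
```

A small concurrency limit is a deliberate choice here: it speeds up the batch while staying under API rate limits, which an unbounded `gather` would likely hit.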

2. Web Scraping Pipeline

Scrape business directories with rotating proxies and anti-detection:

# scrapers/business_scraper.py
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import requests
from typing import List, Dict
import time
import random
 
class BusinessScraper:
    """
    Scrape business information from directories
    
    Supports:
    - Yelp (business name, category, address, phone, website, reviews)
    - Google Maps (same as above + hours, photos)
    - Yellow Pages
    """
    
    def __init__(self, use_proxy: bool = True):
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        options.add_argument('--disable-blink-features=AutomationControlled')
        options.add_argument('user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)')
        
        if use_proxy:
            options.add_argument(f'--proxy-server={self._get_proxy()}')
        
        self.driver = webdriver.Chrome(options=options)
        self.wait = WebDriverWait(self.driver, 10)
    
    def scrape_yelp(
        self,
        category: str,
        location: str,
        limit: int = 100
    ) -> List[Dict]:
        """
        Scrape Yelp for businesses
        
        Example:
            scraper.scrape_yelp("restaurants", "New York, NY", 100)
        """
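        from urllib.parse import urlencode  # local import; encodes spaces and commas in the query

        businesses = []
        page = 0
        
        while len(businesses) < limit:
            # Construct search URL with properly encoded query parameters
            query = urlencode({'find_desc': category, 'find_loc': location, 'start': page * 10})
            url = f"https://www.yelp.com/search?{query}"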
            
            self.driver.get(url)
            
            # Random delay to avoid detection
            time.sleep(random.uniform(2, 5))
            
            # Parse results
            soup = BeautifulSoup(self.driver.page_source, 'html.parser')
            results = soup.find_all('div', class_='arrange-unit__09f24__rqHTg')
            
            for result in results:
                if len(businesses) >= limit:
                    break
                
                try:
                    business = self._parse_yelp_listing(result)
                    if business:
                        businesses.append(business)
                except Exception as e:
                    print(f"Error parsing listing: {e}")
                    continue
            
            # Check if there are more pages
            if not self._has_next_page(soup):
                break
            
            page += 1
        
        return businesses
    
    def _parse_yelp_listing(self, element) -> Dict:
        """Extract structured data from Yelp listing"""
        name_elem = element.find('a', class_='css-19v1rkv')
        name = name_elem.text if name_elem else None
        
        if not name:
            return None
        
        # Extract rating
        rating_elem = element.find('div', {'aria-label': lambda x: x and 'star rating' in x})
        rating = float(rating_elem['aria-label'].split()[0]) if rating_elem else 0
        
        # Extract review count
        review_elem = element.find('span', class_='css-chan6m')
        reviews = int(review_elem.text.split()[0]) if review_elem else 0
        
        # Extract categories
        categories = [cat.text for cat in element.find_all('a', class_='css-11bijt4')]
        
        # Extract neighborhood
        neighborhood_elem = element.find('span', class_='css-1p9ibgf')
        neighborhood = neighborhood_elem.text if neighborhood_elem else None
        
        return {
            'name': name,
            'rating': rating,
            'review_count': reviews,
            'categories': categories,
            'neighborhood': neighborhood,
            'source': 'yelp'
        }
    
    def scrape_google_maps(
        self,
        query: str,
        location: str,
        limit: int = 100
    ) -> List[Dict]:
        """Scrape Google Maps for businesses"""
        from urllib.parse import quote_plus  # local import; encodes spaces and commas
        url = f"https://www.google.com/maps/search/{quote_plus(f'{query} in {location}')}"
        
        self.driver.get(url)
        time.sleep(3)
        
        # Scroll to load more results
        results_div = self.driver.find_element(By.CLASS_NAME, 'feed-view')
        
        for _ in range(limit // 10):
            self.driver.execute_script(
                'arguments[0].scrollTop = arguments[0].scrollHeight',
                results_div
            )
            time.sleep(2)
        
        # Parse results
        soup = BeautifulSoup(self.driver.page_source, 'html.parser')
        listings = soup.find_all('div', class_='Nv2PK')
        
        businesses = []
        for listing in listings[:limit]:
            business = self._parse_gmaps_listing(listing)
            if business:
                businesses.append(business)
        
        return businesses
    
    def _parse_gmaps_listing(self, element) -> Dict:
        """Extract data from Google Maps listing"""
        # Extract name
        name_elem = element.find('div', class_='qBF1Pd')
        name = name_elem.text if name_elem else None
        
        if not name:
            return None
        
        # Extract rating
        rating_elem = element.find('span', class_='MW4etd')
        rating = float(rating_elem.text) if rating_elem else 0
        
        # Extract address
        address_elem = element.find('div', class_='W4Efsd')
        address = address_elem.text if address_elem else None
        
        # Extract phone
        phone_elem = element.find('span', class_='UsdlK')
        phone = phone_elem.text if phone_elem else None
        
        # Extract website
        website_elem = element.find('a', {'data-value': 'Website'})
        website = website_elem['href'] if website_elem else None
        
        return {
            'name': name,
            'rating': rating,
            'address': address,
            'phone': phone,
            'website': website,
            'source': 'google_maps'
        }
    
    def enrich_with_website_data(self, business: Dict) -> Dict:
        """
        Visit business website and extract additional data
        
        Extracts:
        - Email addresses
        - Social media links
        - About/description
        - Services offered
        """
        if not business.get('website'):
            return business
        
        try:
            response = requests.get(business['website'], timeout=10)
            soup = BeautifulSoup(response.content, 'html.parser')
            
            # Extract emails
            emails = self._extract_emails(soup)
            business['emails'] = emails
            
            # Extract social media
            social = self._extract_social_links(soup)
            business['social_media'] = social
            
            # Extract description using GPT-4
            description = self._summarize_website(soup.get_text())
            business['description'] = description
            
        except Exception as e:
            print(f"Error enriching {business['website']}: {e}")
        
        return business
    
    def _extract_emails(self, soup: BeautifulSoup) -> List[str]:
        """Extract email addresses from website"""
        import re
        text = soup.get_text()
        # Note: inside a character class "|" is a literal pipe, so the TLD class is [A-Za-z]
        emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)
        return list(set(emails))
    
    def _extract_social_links(self, soup: BeautifulSoup) -> Dict:
        """Extract social media profile links"""
        social = {}
        
        for link in soup.find_all('a', href=True):
            href = link['href']
            if 'instagram.com' in href:
                social['instagram'] = href
            elif 'facebook.com' in href:
                social['facebook'] = href
            elif 'twitter.com' in href or 'x.com' in href:
                social['twitter'] = href
        
        return social
    
    def _summarize_website(self, text: str) -> str:
        """Use GPT-4 to summarize website content"""
        from openai import OpenAI
        
        client = OpenAI()
        
        response = client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[
                {"role": "system", "content": "Summarize this business website in 2-3 sentences."},
                {"role": "user", "content": text[:4000]}  # Truncate to fit context
            ],
            max_tokens=150
        )
        
        return response.choices[0].message.content
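
`BusinessScraper.__init__` calls `self._get_proxy()`, which is not shown above. As a minimal sketch of what proxy rotation could look like, the snippet below cycles round-robin through a static list of proxy URLs. The `ProxyPool` class and the example addresses are hypothetical; a production setup would more likely pull proxies from a provider and drop ones that fail.

```python
import itertools
from typing import Iterable


class ProxyPool:
    """Round-robin over a fixed list of proxy URLs (hypothetical helper)."""

    def __init__(self, proxies: Iterable[str]):
        proxies = list(proxies)
        if not proxies:
            raise ValueError("ProxyPool needs at least one proxy")
        self._cycle = itertools.cycle(proxies)

    def next_proxy(self) -> str:
        """Return the next proxy in rotation, e.g. for Chrome's --proxy-server flag."""
        return next(self._cycle)


if __name__ == "__main__":
    # Placeholder addresses from the documentation-only 203.0.113.0/24 range
    pool = ProxyPool(["http://203.0.113.10:8080", "http://203.0.113.11:8080"])
    for _ in range(3):
        print(pool.next_proxy())
```

Inside `BusinessScraper`, `_get_proxy` could then return `pool.next_proxy()`. Since Chrome reads `--proxy-server` at launch, switching proxies between requests would mean starting a new driver instance.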

3. Lead Scoring with Machine Learning

# ml/lead_scorer.py
from typing import Dict

import pandas as pd
import requests
from bs4 import BeautifulSoup
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
 
class LeadScorer:
    """
    ML model to score lead quality (0-100)
    
    Features:
    - Business metrics (rating, review count, followers)
    - Website quality (has site, SSL, mobile-friendly)
    - Social presence (Instagram, Facebook followers)
    - Competitor analysis (similar businesses using our service)
    """
    
    def __init__(self):
        self.model = RandomForestClassifier(n_estimators=100, random_state=42)
        self.scaler = StandardScaler()
        self.trained = False
    
    def train(self, historical_leads: pd.DataFrame):
        """
        Train on historical data
        
        DataFrame columns:
        - rating: float (1-5)
        - review_count: int
        - has_website: bool
        - instagram_followers: int
        - facebook_followers: int
        - website_quality_score: float (0-1)
        - response_rate: float (used to derive the binary target)
        """
        features = [
            'rating',
            'review_count',
            'has_website',
            'instagram_followers',
            'facebook_followers',
            'website_quality_score'
        ]
        
        X = historical_leads[features]
        y = (historical_leads['response_rate'] > 0.05).astype(int)  # Binary target: 1 if response rate exceeds 5%
        
        X_scaled = self.scaler.fit_transform(X)
        
        self.model.fit(X_scaled, y)
        self.trained = True
        
        print(f"Model trained on {len(X)} leads")
        print(f"Feature importances: {dict(zip(features, self.model.feature_importances_))}")
    
    def score_lead(self, lead: Dict) -> float:
        """
        Score a single lead (0-100)
        
        Returns:
            score: Higher = better quality lead
        """
        if not self.trained:
            # Use heuristic scoring if model not trained
            return self._heuristic_score(lead)
        
        # Extract features
        features = {
            'rating': lead.get('rating', 0),
            'review_count': lead.get('review_count', 0),
            'has_website': 1 if lead.get('website') else 0,
            'instagram_followers': self._get_instagram_followers(lead),
            'facebook_followers': self._get_facebook_followers(lead),
            'website_quality_score': self._assess_website_quality(lead.get('website'))
        }
        
        X = pd.DataFrame([features])
        X_scaled = self.scaler.transform(X)
        
        # Predict probability
        proba = self.model.predict_proba(X_scaled)[0][1]
        
        # Convert to 0-100 scale
        score = proba * 100
        
        return score
    
    def _heuristic_score(self, lead: Dict) -> float:
        """Fallback scoring without ML model"""
        score = 0
        
        # Rating (0-25 points)
        rating = lead.get('rating', 0)
        score += (rating / 5) * 25
        
        # Review count (0-25 points)
        review_count = lead.get('review_count', 0)
        score += min(review_count / 100, 1) * 25
        
        # Website (0-20 points)
        if lead.get('website'):
            score += 20
        
        # Social media (0-15 points)
        if lead.get('social_media', {}).get('instagram'):
            score += 10
        if lead.get('social_media', {}).get('facebook'):
            score += 5
        
        # Email availability (0-15 points)
        if lead.get('emails'):
            score += 15
        
        return score
    
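    def _get_instagram_followers(self, lead: Dict) -> int:
        """Fetch Instagram follower count"""
        instagram_url = lead.get('social_media', {}).get('instagram')
        
        if not instagram_url:
            return 0
        
        # Use Instagram API or scraping to get follower count
        # Simplified for example
        return lead.get('instagram_followers', 0)
    
    def _get_facebook_followers(self, lead: Dict) -> int:
        """Fetch Facebook follower count (called by score_lead)"""
        if not lead.get('social_media', {}).get('facebook'):
            return 0
        
        # Simplified in the same way as the Instagram helper
        return lead.get('facebook_followers', 0)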
    
    def _assess_website_quality(self, url: str) -> float:
        """Score website quality (0-1)"""
        if not url:
            return 0
        
        score = 0
        
        try:
            response = requests.get(url, timeout=5)
            
            # SSL (0.3 points)
            if url.startswith('https://'):
                score += 0.3
            
            # Status code (0.2 points)
            if response.status_code == 200:
                score += 0.2
            
            # Mobile friendly (0.3 points)
            soup = BeautifulSoup(response.content, 'html.parser')
            viewport = soup.find('meta', attrs={'name': 'viewport'})
            if viewport:
                score += 0.3
            
            # Contact info (0.2 points)
            if 'contact' in response.text.lower():
                score += 0.2
        
        except Exception:
            # Unreachable or malformed sites keep whatever score was accumulated (usually 0)
            pass
        
        return score
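
To make the fallback weights concrete, here is a worked example that restates the `_heuristic_score` arithmetic as a standalone function and applies it to a made-up lead. The sample lead is invented for illustration; the weights are copied from the method above.

```python
from typing import Dict


def heuristic_score(lead: Dict) -> float:
    """Standalone restatement of LeadScorer._heuristic_score (same weights)."""
    social = lead.get("social_media", {})
    score = (lead.get("rating", 0) / 5) * 25                  # rating: 0-25 points
    score += min(lead.get("review_count", 0) / 100, 1) * 25   # reviews: 0-25, capped at 100 reviews
    score += 20 if lead.get("website") else 0                 # website: 20 points
    score += 10 if social.get("instagram") else 0             # Instagram: 10 points
    score += 5 if social.get("facebook") else 0               # Facebook: 5 points
    score += 15 if lead.get("emails") else 0                  # email found: 15 points
    return score


sample_lead = {
    "rating": 4.5,
    "review_count": 80,
    "website": "https://example.com",
    "social_media": {"instagram": "https://instagram.com/example"},
    "emails": ["hello@example.com"],
}

# 22.5 (rating) + 20 (reviews) + 20 (website) + 10 (Instagram) + 15 (email) = 87.5
print(heuristic_score(sample_lead))
```

A lead like this one clears the 70-point qualification threshold used in `generate_leads`, while a business with no website and no discoverable email can reach at most 65 points and would be filtered out.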

4. GPT-4 Personalized Email Generation

# outreach/email_generator.py
from openai import OpenAI
from typing import Dict, List
 
class EmailGenerator:
    """
    Generate personalized cold emails using GPT-4
    
    Features:
    - Hyper-personalization based on research
    - Multiple frameworks (AIDA, PAS, Before-After-Bridge)
    - A/B testing variants
    - Follow-up sequences
    """
    
    def __init__(self):
        self.client = OpenAI()
        self.case_studies = self._load_case_studies()
    
    def generate_cold_email(
        self,
        lead: Dict,
        framework: str = "AIDA"
    ) -> Dict:
        """
        Generate personalized cold email
        
        Args:
            lead: Enriched lead data (name, pain points, website summary, etc.)
            framework: "AIDA", "PAS", or "BAB"
        
        Returns:
            {
                'subject': str,
                'body': str,
                'ps': str
            }
        """
        # Select relevant case study
        case_study = self._match_case_study(lead)
        
        prompt = self._build_email_prompt(lead, case_study, framework)
        
        response = self.client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[
                {
                    "role": "system",
                    "content": """You are an expert cold email copywriter.
                    
                    Rules:
                    1. Keep emails under 150 words
                    2. Use specific details about the prospect
                    3. Lead with value, not features
                    4. Include social proof (case studies)
                    5. Clear, singular CTA
                    6. Conversational tone
                    7. No hype or exaggeration"""
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
            temperature=0.8,
            max_tokens=500
        )
        
        email_content = response.choices[0].message.content
        
        # Parse email into subject + body + PS
        parts = self._parse_email_parts(email_content)
        
        return parts
    
    def _build_email_prompt(
        self,
        lead: Dict,
        case_study: Dict,
        framework: str
    ) -> str:
        """Build GPT-4 prompt with lead-specific details"""
        # Google Maps leads carry no categories or neighborhood, so fall back gracefully
        category = (lead.get('categories') or ['local business'])[0]
        area = lead.get('neighborhood') or 'their area'
        pain_point = (lead.get('pain_points') or ['Growing their online presence'])[0]
        
        prompt = f"""
        Write a personalized cold email to {lead['name']}, a {category} in {area}.
        
        PROSPECT RESEARCH:
        - Website summary: {lead.get('website_summary', 'No website')}
        - Social media: {len(lead.get('social_media', {}))} platforms
        - Key pain point: {pain_point}
        - Rating: {lead.get('rating', 0)} stars ({lead.get('review_count', 0)} reviews)
        
        OUR OFFER:
        We help local businesses grow through social media marketing and web design.
        
        RELEVANT CASE STUDY:
        - Client: {case_study['name']} ({case_study['industry']})
        - Result: {case_study['result']}
        - Timeframe: {case_study['timeframe']}
        
        FRAMEWORK: {framework}
        {self._get_framework_guide(framework)}
        
        Generate:
        1. Subject line (7-10 words, specific and intriguing)
        2. Email body (120-150 words)
        3. P.S. line (optional, adds urgency or social proof)
        
        Make it conversational, specific to {lead['name']}, and compelling.
        """
        
        return prompt
    
    def _get_framework_guide(self, framework: str) -> str:
        """Get email framework structure"""
        frameworks = {
            "AIDA": """
            - Attention: Hook with specific observation about their business
            - Interest: Mention their pain point
            - Desire: Show case study result
            - Action: Clear CTA (calendar link or reply)
            """,
            "PAS": """
            - Problem: Identify specific problem they face
            - Agitate: Show consequences of not solving it
            - Solve: Present solution with case study
            """,
            "BAB": """
            - Before: Describe their current situation
            - After: Paint picture of success (use case study)
            - Bridge: Show how we get them there
            """
        }
        
        return frameworks.get(framework, frameworks["AIDA"])
    
    def _match_case_study(self, lead: Dict) -> Dict:
        """Select most relevant case study for this lead"""
        # Match by industry/category
        # "or ['']" also covers an empty categories list, which the Yelp parser can return
        lead_category = (lead.get('categories') or [''])[0].lower()
        
        for case_study in self.case_studies:
            if any(cat in lead_category for cat in case_study['industries']):
                return case_study
        
        # Default to most impressive result
        return self.case_studies[0]
    
    def _load_case_studies(self) -> List[Dict]:
        """Load client success stories"""
        return [
            {
                'name': 'FitCheck',
                'industry': 'Fashion Tech',
                'industries': ['fashion', 'retail', 'boutique'],
                'result': '300% user growth in one quarter',
                'timeframe': '3 months'
            },
            {
                'name': 'Workwear',
                'industry': 'B2B Fashion',
                'industries': ['fashion', 'corporate', 'professional'],
                'result': '$2,000+ monthly revenue increase',
                'timeframe': '4 months'
            },
            {
                'name': 'Gloss Authority',
                'industry': 'Mobile Detailing',
                'industries': ['automotive', 'service', 'mobile'],
                'result': '150% lead increase and 2x ROI',
                'timeframe': '6 months'
            },
            {
                'name': 'Piccola Cucina',
                'industry': 'Restaurant',
                'industries': ['restaurant', 'food', 'dining'],
                'result': '15% sales increase in 5 weeks',
                'timeframe': '5 weeks'
            },
            {
                'name': 'Capio Tattoo',
                'industry': 'Creative Arts',
                'industries': ['creative', 'art', 'studio'],
                'result': '10,000+ Instagram followers',
                'timeframe': '6 months'
            }
        ]
    
    def _parse_email_parts(self, email_content: str) -> Dict:
        """Parse GPT-4 output into structured email"""
        lines = email_content.split('\n')
        
        subject = ""
        body_lines = []
        ps = ""
        
        for line in lines:
            line = line.strip()
            if line.lower().startswith('subject:'):
                subject = line.split(':', 1)[1].strip()
            elif line.lower().startswith(('p.s.', 'ps:')):
                ps = line
            elif line:
                body_lines.append(line)
        
        body = '\n\n'.join(body_lines)
        
        return {
            'subject': subject or "Quick question about your business",
            'body': body,
            'ps': ps
        }
    
    def generate_followup_sequence(
        self,
        lead: Dict,
        original_email: Dict
    ) -> List[Dict]:
        """
        Generate 3-email follow-up sequence
        
        Day 0: Initial email
        Day 3: Follow-up #1 (value add)
        Day 7: Follow-up #2 (case study deep dive)
        Day 14: Follow-up #3 (breakup email)
        """
        followups = []
        
        # Follow-up 1: Add value
        followup1 = self.client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[
                {
                    "role": "system",
                    "content": "Write a brief follow-up email (50-75 words) that adds value without being pushy."
                },
                {
                    "role": "user",
                    "content": f"""
                    Original email subject: {original_email['subject']}
                    Prospect: {lead['name']}
                    
                    Write follow-up that:
                    1. Acknowledges they're busy
                    2. Shares quick tip or insight relevant to their business
                    3. Soft CTA
                    """
                }
            ]
        ).choices[0].message.content
        
        followups.append({
            'day': 3,
            'subject': f"Re: {original_email['subject']}",
            'body': followup1
        })
        
        # Follow-up 2: Case study deep dive
        case_study = self._match_case_study(lead)
        
        followup2 = self.client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[
                {
                    "role": "system",
                    "content": "Write a case study-focused follow-up (75-100 words)."
                },
                {
                    "role": "user",
                    "content": f"""
                    Prospect: {lead['name']}
                    Case study: {case_study['name']} - {case_study['result']}
                    
                    Write follow-up that:
                    1. Shares detailed case study
                    2. Explains how it's relevant to them
                    3. Offers free consultation
                    """
                }
            ]
        ).choices[0].message.content
        
        followups.append({
            'day': 7,
            'subject': f"How {case_study['name']} achieved {case_study['result']}",
            'body': followup2
        })
        
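        # Follow-up 3: Breakup email (static template, no API call).
        # Built from separate paragraphs so the body carries no code indentation.
        first_name = lead['name'].split()[0]
        followup3 = "\n\n".join([
            f"Hi {first_name},",
            "I haven't heard back so I'll assume this isn't a priority right now.",
            "If things change, feel free to reach out. I'll be here.",
            "Best of luck with your business!",
        ])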
        
        followups.append({
            'day': 14,
            'subject': "Closing the loop",
            'body': followup3.strip()
        })
        
        return followups
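
The follow-up dictionaries above carry only a relative `day` offset (3, 7, 14). As a minimal sketch of how those offsets could be turned into concrete send dates, the helper below attaches a `send_on` date to each follow-up. The function name `schedule_followups` and the `send_on` key are assumptions for illustration; the production scheduler is not shown in this write-up.

```python
from datetime import date, timedelta
from typing import Dict, List


def schedule_followups(followups: List[Dict], first_send: date) -> List[Dict]:
    """Attach a concrete send date to each follow-up based on its 'day' offset.

    Hypothetical helper: 'send_on' is an illustrative key, not part of the
    original system.
    """
    scheduled = []
    for followup in sorted(followups, key=lambda f: f["day"]):
        send_on = first_send + timedelta(days=followup["day"])
        # Copy the dict so the caller's follow-ups are left untouched
        scheduled.append({**followup, "send_on": send_on})
    return scheduled


# Example using the same day offsets as generate_followup_sequence
sequence = [
    {"day": 14, "subject": "Closing the loop", "body": "..."},
    {"day": 3, "subject": "Re: Quick question about your business", "body": "..."},
    {"day": 7, "subject": "How FitCheck achieved 300% user growth", "body": "..."},
]

for item in schedule_followups(sequence, first_send=date(2024, 1, 1)):
    print(item["send_on"].isoformat(), "-", item["subject"])
```

Keeping the offsets relative in the generator and resolving them at scheduling time means the same sequence can be reused for leads contacted on different days.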

Results

Platform Metrics (12 Months)

| Metric | Value |
| --- | --- |
| Prospects Processed | 62,000+ |
| Qualified Leads Generated | 5,200/month |
| Emails Sent | 18,500/month |
| Response Rate | 8.2% (vs 2% manual) |
| Meeting Booking Rate | 3.1% |
| Client Acquisition Rate | 3x increase |

Client Success Stories

FitCheck (Fashion Tech)

Workwear (B2B Fashion)

Gloss Authority (Mobile Detailing)

Piccola Cucina (Restaurant)

Cost Savings

| Metric | Manual Process | Automated System | Savings |
| --- | --- | --- | --- |
| Time per 100 leads | 20 hours | 45 min | 96% less time |
| Cost per lead | $15-20 | $0.50 | 97% cheaper |
| Response rate | 2% | 8.2% | 4x better |
| Monthly labor cost | $9,000 | $300 (compute) | $8,700 saved |
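
The savings figures follow directly from the before and after values in the table. A short sketch of that arithmetic is below; `percent_reduction` is a helper name introduced here for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Time per 100 leads: 20 hours vs 45 minutes, both expressed in minutes
time_reduction = percent_reduction(20 * 60, 45)

# Cost per lead: the manual cost is a range ($15-20), so compute both ends
cost_reduction_low = percent_reduction(15, 0.50)
cost_reduction_high = percent_reduction(20, 0.50)

# Response rate: 8.2% automated vs 2% manual
response_multiplier = 8.2 / 2

# Monthly cost: $9,000 labor vs $300 compute
monthly_savings = 9_000 - 300

print(f"Time reduction: {time_reduction:.2f}%")
print(f"Cost reduction: {cost_reduction_low:.2f}% to {cost_reduction_high:.2f}%")
print(f"Response rate multiplier: {response_multiplier:.1f}x")
print(f"Monthly savings: ${monthly_savings:,}")
```

The time figure is a reduction in hours spent (about 96%), which is why it is better read as "less time" than as a speed multiplier.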

Challenges & Solutions

Challenge 1: Email Deliverability

Problem: 40% of cold emails went to spam, killing response rates.

Solution: Multi-pronged approach

Result: Spam rate dropped to 8%, inbox rate increased to 85%.

Challenge 2: Scraping Detection

Problem: Yelp and Google Maps blocked our scrapers after 100-200 requests.

Solution:

Result: Successfully scraped 5k+ businesses/day without blocks.

Challenge 3: GPT-4 Hallucinations

Problem: GPT-4 occasionally invented fake case study details or made up statistics.

Solution: Structured prompts with validation

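# Add validation layer
from typing import Dict

# CASE_STUDIES and verify_case_study_facts are defined elsewhere in the
# module and are not shown here. `lead` is currently unused.
SUSPICIOUS_TERMS = ("guarantee", "100%", "instant", "overnight")

def validate_email_content(email: str, lead: Dict) -> bool:
    """Return False if a generated email contains unverifiable claims."""
    # If the email names a case study, its figures must match the known facts
    if any(study['name'] in email for study in CASE_STUDIES):
        if not verify_case_study_facts(email):
            return False

    # Reject overpromising language (substring match, so "instantly" also trips it)
    email_lower = email.lower()
    if any(term in email_lower for term in SUSPICIOUS_TERMS):
        return False

    return True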

Result: Hallucination rate dropped from 12% to <1%.
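
The body of `verify_case_study_facts` is not shown above. One way it could work, offered as a sketch rather than the actual implementation, is to extract every numeric figure from each sentence that names a case study and reject the email if any figure is absent from that study's known result or timeframe. The regex, the `extract_figures` helper, and the trimmed-down `CASE_STUDIES` list are assumptions for illustration; the result and timeframe strings are copied from the case studies listed earlier.

```python
import re
from typing import Dict, List, Set

# Known-good figures, copied from the case study definitions above
CASE_STUDIES: List[Dict] = [
    {"name": "FitCheck", "result": "300% user growth in one quarter", "timeframe": "3 months"},
    {"name": "Workwear", "result": "$2,000+ monthly revenue increase", "timeframe": "4 months"},
    {"name": "Gloss Authority", "result": "150% lead increase and 2x ROI", "timeframe": "6 months"},
]

# Matches figures such as "300%", "$2,000+", "2x", "10,000+"
FIGURE_PATTERN = re.compile(r"\$?\d[\d,]*(?:\.\d+)?(?:%|x|\+)?", re.IGNORECASE)


def extract_figures(text: str) -> Set[str]:
    """Return the numeric figures in `text`, normalized for comparison."""
    return {
        match.group().replace(",", "").rstrip("+").lower()
        for match in FIGURE_PATTERN.finditer(text)
    }


def verify_case_study_facts(email: str) -> bool:
    """Return False if a sentence naming a case study cites an unknown figure."""
    sentences = re.split(r"(?<=[.!?])\s+", email)
    for study in CASE_STUDIES:
        allowed = extract_figures(f"{study['result']} {study['timeframe']}")
        for sentence in sentences:
            if study["name"] in sentence and not extract_figures(sentence) <= allowed:
                return False
    return True


print(verify_case_study_facts("FitCheck saw 300% user growth in 3 months."))
print(verify_case_study_facts("FitCheck saw 500% user growth in 3 months."))
```

Checking per sentence keeps unrelated numbers elsewhere in the email (a phone number in the signature, for example) from triggering a rejection. The trade-off is that a fabricated figure in a sentence that refers to the case study only by pronoun would slip through.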

Future Enhancements

1. Voice AI for Follow-Up Calls

Auto-dial leads with conversational AI:

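# Sketch only: VoiceAI, personalize_script, and high_score_leads are
# hypothetical placeholders for code that does not exist yet. VoiceAI stands
# in for a wrapper around a text-to-speech provider (such as ElevenLabs)
# plus a telephony service.

voice_agent = VoiceAI(
    voice="professional_female",
    script_template="Hi {name}, I sent you an email about {topic}..."
)

# Auto-call qualified leads
for lead in high_score_leads:
    voice_agent.call(lead['phone'], personalize_script(lead))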

2. LinkedIn Outreach Integration

Expand to LinkedIn for B2B:

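# Sketch only: find_linkedin_profiles, generate_linkedin_message, and
# send_linkedin_message are hypothetical helpers; company_name is assumed
# to come from the enriched lead record.
TARGET_TITLES = {'Owner', 'CEO', 'Marketing Director'}

# Find decision makers on LinkedIn
linkedin_profiles = find_linkedin_profiles(company_name)

# Send InMail with GPT-4 personalization
for profile in linkedin_profiles:
    if profile['title'] in TARGET_TITLES:
        send_linkedin_message(profile, generate_linkedin_message(profile))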

3. Predictive Lead Scoring

Use historical conversion data to improve scoring:

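from xgboost import XGBClassifier

# features_from_leads, closed_deals, and new_lead_features are placeholders
# for a feature pipeline and data that are not shown here.

# Train on closed deals
X = features_from_leads(closed_deals)
y = [1 if deal.converted else 0 for deal in closed_deals]

model = XGBClassifier()
model.fit(X, y)

# predict_proba returns one [P(no conversion), P(conversion)] row per lead;
# column 1 is the conversion probability
conversion_prob = model.predict_proba(new_lead_features)[:, 1]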
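
The snippet above leaves `features_from_leads` undefined. A minimal sketch of what such a function might look like is below, assuming each lead is a dict as in the rest of the pipeline. Only `phone` and `categories` appear in the code shown in this section; the `rating`, `review_count`, and `website` fields are assumptions about the scraped schema and would need to be swapped for the real field names.

```python
from typing import Dict, List


def features_from_leads(leads: List[Dict]) -> List[List[float]]:
    """Turn lead dicts into fixed-length numeric feature rows.

    Field names other than 'phone' and 'categories' are hypothetical.
    Missing values fall back to 0 so every row has the same length.
    """
    rows = []
    for lead in leads:
        rows.append([
            float(lead.get("rating") or 0.0),         # average review rating
            float(lead.get("review_count") or 0),     # number of reviews
            1.0 if lead.get("website") else 0.0,      # has a website
            1.0 if lead.get("phone") else 0.0,        # has a phone number
            float(len(lead.get("categories") or [])), # number of listed categories
        ])
    return rows


example_leads = [
    {
        "name": "Example Bistro",
        "rating": 4.5,
        "review_count": 120,
        "website": "https://example.com",
        "phone": "555-0100",
        "categories": ["restaurant", "food"],
    },
    {"name": "Sparse Lead"},
]

for row in features_from_leads(example_leads):
    print(row)
```

The rows can be wrapped in `numpy.array(...)` before being passed to `model.fit`. Defaulting missing fields to zero is a simplification. Tree-based models such as XGBoost can also handle missing values natively when they are encoded as NaN.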

Conclusion

Building an AI-powered lead generation system transformed Lume's client acquisition:

Key Technical Wins:

Technologies: Python, LangChain, OpenAI GPT-4, React, MongoDB, BeautifulSoup, Selenium, SendGrid

Timeline: 8 weeks from prototype to production

Impact: Enabled 15+ local businesses to scale digital marketing, achieving 300% growth for FitCheck, $2k+ revenue for Workwear, and 2x ROI for Gloss Authority

This project showed that agentic AI, web scraping, and personalization together can automate complex workflows that previously required human expertise, turning cold outreach from a numbers game into a precision instrument.


Additional Resources