Agentic Browser Automation Without Giving AI Your Login
A beginner-friendly tutorial for building scheduled or queue-driven agentic browser automation with Selenium, noVNC manual login, and a narrow API wrapper so agents can draft website forms without seeing credentials or controlling your PC.

Quick Answer
If you only need a one-off browsing session, use the browser already built into ChatGPT, Codex, Claude Desktop, or your favorite AI tool.
This tutorial is for a different problem: agentic automation that needs to run on a schedule or from a queue. Think daily form drafts, recurring portal updates, internal admin chores, or a workflow that should keep running on a server while your laptop is closed.
    scheduler or queue
      -> agentic workflow
      -> your small wrapper API
      -> Selenium WebDriver
      -> browser container
      -> logged-in website

You log in yourself through a visible noVNC browser. The AI only sends structured requests to your wrapper, such as "create this draft form" or "take a screenshot." That keeps login out of the AI conversation and avoids letting a computer-use agent operate your personal machine.
The pattern is useful when a site requires login, has no public API, and you want an agentic workflow that can run on a small server instead of depending on your PC being awake 24/7.
The Problem
Many useful workflows are trapped behind websites.
Maybe it is a partner portal, an internal admin page, a school form, a membership site, or some boring dashboard that only exists in a browser. You want AI to prepare the data and fill the form, but the website does not offer an API.
The tempting shortcut is dangerous:

    Give the AI my login
    Ask it to open the website
    Let it drive my browser

That mixes too many responsibilities. The AI sees secrets. The AI can click anywhere. Your laptop becomes part of the production workflow. If the workflow should run overnight or while you are away, your personal computer now has to stay on.
A better split is:

    Human owns login.
    Browser container owns the session.
    Wrapper API owns the allowed actions.
    Agent owns the recurring workflow.

That is the design this tutorial builds.
Why Not Just Use ChatGPT's Browser?
Built-in AI browsers are good for interactive work. You are present. You ask a question. The tool opens pages, reads, clicks, and reports back.
Agentic automation has a different shape. It may run every morning at 8:00. It may pick up jobs from a queue. It may retry after a website timeout. It may need to create a draft while you are asleep and notify you only when the browser session needs login.
| Use case | Built-in AI browser | Wrapper API plus Selenium |
|---|---|---|
| One-off research or browsing | Good fit | Usually unnecessary |
| Daily or event-driven automation | Weak fit | Good fit |
| Runs while your PC is off | No | Yes, on a server |
| Stable tool contract for an agent | Limited | Explicit HTTP API |
| Login handling | Easy to mix with the chat | Human logs in through noVNC |
The point is not that built-in browsers are bad. They solve the interactive case. This pattern solves the automation case.
A Daily Automation Example
Imagine a private vendor portal with no API. Every weekday morning, you want an agentic workflow to prepare draft requests based on yesterday's operational notes.
    8:00 AM cron trigger
      -> fetch yesterday's notes from your database or inbox
      -> AI writes the request subject and details
      -> agent calls POST /portal/draft-request with dry_run=true
      -> wrapper fills the logged-in browser form
      -> wrapper returns screenshot and status
      -> agent notifies you if review, login, or submission approval is needed

That is different from asking a chat tool to browse once. The agent has a recurring job, a retry path, a tool contract, and a way to stop when human authentication or approval is required.
For a first version, keep the final action human-reviewed. Let the agent create drafts. Let the wrapper return screenshots. Let a human approve the final submit step when the action matters.
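The daily flow above can be sketched as one function that a cron job or queue worker calls. This is a minimal sketch under stated assumptions, not the tutorial's own code: `run_daily_job`, `call_wrapper`, and `notify` are hypothetical names, the subject line is a placeholder for what an LLM would write, and the HTTP call and the notification channel are passed in so the logic stays independent of any particular scheduler or messaging tool.

```python
from typing import Callable


def run_daily_job(
    notes: str,
    call_wrapper: Callable[[dict], dict],
    notify: Callable[[str], None],
) -> str:
    """Run one scheduled pass and return an outcome label for logging."""
    # Assumption: in a real workflow an LLM writes these fields from the notes.
    payload = {
        "subject": f"Daily request: {notes[:60]}",
        "details": notes,
        "priority": "normal",
        "dry_run": True,  # first versions only fill the form, never save
    }
    try:
        response = call_wrapper(payload)
    except Exception as exc:  # timeout, connection error, bad HTTP status
        notify(f"Wrapper call failed, will retry on the next run: {exc}")
        return "retry_later"

    status = response.get("status")
    if status == "needs_login":
        # Stop here. The job never asks anyone for credentials.
        notify("Portal session expired. Please log in through noVNC.")
        return "paused_for_login"
    if status == "filled":
        notify(f"Draft filled for review: {response.get('screenshot_file')}")
        return "awaiting_review"
    notify(f"Unexpected wrapper status: {status!r}")
    return "needs_review"
```

Every branch ends in a notification and a label, so the scheduler can log what happened without the agent ever deciding to click further on its own.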
What You Are Building
For the tutorial, the target website is still a private vendor portal with a "New Request" form. The form has a subject, details, priority, and a "Save Draft" button. The real website could be different, but the architecture stays the same.
| Piece | Job |
|---|---|
| Scheduler or queue | Starts the automation daily or when a job arrives |
| Agentic workflow | Prepares data, calls tools, handles retry and review states |
| Wrapper API | Converts safe HTTP requests into Selenium actions |
| Selenium container | Runs Chrome and exposes WebDriver on 4444 |
| noVNC | Lets you see and control the browser on 7900 |
The official Docker Selenium README documents the standalone browser images, WebDriver port, noVNC port, and the common --shm-size 2g setup. The tutorial below wraps that browser in a smaller API so your AI does not talk to WebDriver directly.
What You Need
- Docker and Docker Compose
- A website account you are allowed to automate
- A basic terminal
- An agentic workflow runner that can make HTTP requests, such as n8n, cron plus a worker, or an agent framework
Step 1: Start A Visible Browser Container
Create a folder:

    mkdir browser-automation-wrapper
    cd browser-automation-wrapper

Create docker-compose.yml:

    services:
      selenium:
        image: selenium/standalone-chrome:latest
        shm_size: "2g"
        ports:
          - "127.0.0.1:4444:4444"
          - "127.0.0.1:7900:7900"
        environment:
          SE_VNC_PASSWORD: "change-this-password"

Start it:

    docker compose up -d

Open the live browser:
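    http://127.0.0.1:7900/?autoconnect=1&resize=scale&password=change-this-password

Use that browser to log in to your target website yourself. The desktop may look empty at first: in the standalone image, a Chrome window appears only once a WebDriver session starts, which the wrapper from Step 5 does on its first request. Log in inside that window, because the login state belongs to that browser session. Do not ask AI to log in. Do not paste your password into an AI chat. The browser session now lives inside the container.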
Step 2: Keep WebDriver Private
WebDriver is powerful. If someone can reach it, they can control the browser. That is why the compose file binds 4444 and 7900 to 127.0.0.1.

| Do | Do Not |
|---|---|
| Bind Selenium to localhost | Expose 4444 to the public internet |
| Use noVNC only for manual login and debugging | Give noVNC access to an AI agent |
| Expose a narrow wrapper API | Let AI send arbitrary WebDriver commands |
This is the main safety boundary. The AI should not receive a general browser-control endpoint. It should receive a small API with only the actions you approve.
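One way to keep that boundary from eroding is a small check that flags any published port not bound to loopback. The sketch below is an assumption-heavy illustration: `public_port_bindings` is a hypothetical helper, it only understands the IPv4 short syntax used in this tutorial (`HOST_IP:HOST_PORT:CONTAINER_PORT`), and you would feed it the `ports` list from your own compose file.

```python
def public_port_bindings(ports: list[str]) -> list[str]:
    """Return the Compose port entries that are not bound to 127.0.0.1."""
    risky = []
    for entry in ports:
        parts = entry.split(":")
        # "HOST_IP:HOST_PORT:CONTAINER_PORT" has three parts. Shorter forms
        # such as "4444:4444" publish the port on every interface.
        host_ip = parts[0] if len(parts) == 3 else ""
        if host_ip != "127.0.0.1":
            risky.append(entry)
    return risky


if __name__ == "__main__":
    # The two bindings from this tutorial's compose file pass the check.
    print(public_port_bindings(["127.0.0.1:4444:4444", "127.0.0.1:7900:7900"]))
```

Running it in CI or before a deploy is a cheap way to catch an accidental `"4444:4444"` edit.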
Step 3: Define The Allowed Workflow
Before writing code, describe the exact task. For the example portal:
    Allowed action:
      Create a draft request in the vendor portal.

    Inputs:
      subject
      details
      priority
      dry_run

    Rules:
      Never log in.
      Never change account settings.
      Never click Save Draft unless dry_run is false.
      Return needs_login if the browser is on the login page.
      Return a screenshot path for debugging.

This is where the wrapper becomes safer than a computer-use agent. The agent cannot decide to click account settings because your API does not expose that action.
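The contract above can also be written down as code before any browser is involved. The following is a standard-library sketch, not the wrapper itself: `DraftInput` and `parse_draft_input` are hypothetical names, and the 120 and 4000 character limits are the ones this tutorial uses later in its generation prompt.

```python
from dataclasses import dataclass

ALLOWED_PRIORITIES = ("low", "normal", "high")
ALLOWED_FIELDS = {"subject", "details", "priority", "dry_run"}


@dataclass(frozen=True)
class DraftInput:
    subject: str
    details: str
    priority: str = "normal"
    dry_run: bool = True  # safe default: fill the form, click nothing


def parse_draft_input(raw: dict) -> DraftInput:
    """Validate a caller payload against the allowed-workflow contract."""
    unknown = set(raw) - ALLOWED_FIELDS
    if unknown:
        # Anything outside the contract (URLs, selectors, scripts) is refused.
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    missing = {"subject", "details"} - set(raw)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    draft = DraftInput(**raw)
    if not isinstance(draft.subject, str) or not 0 < len(draft.subject) <= 120:
        raise ValueError("subject must be 1-120 characters")
    if not isinstance(draft.details, str) or not 0 < len(draft.details) <= 4000:
        raise ValueError("details must be 1-4000 characters")
    if draft.priority not in ALLOWED_PRIORITIES:
        raise ValueError("priority must be low, normal, or high")
    if not isinstance(draft.dry_run, bool):
        raise ValueError("dry_run must be true or false")
    return draft
```

Rejecting unknown fields is the part that matters most: a caller cannot smuggle in a selector or a URL, because the contract has no slot for one.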
Step 4: Ask AI To Generate The Wrapper
Here is a prompt you can give to your coding AI. Replace the fake portal URL and selector names with your real site details after you inspect the page.
    Build a small FastAPI application that wraps Selenium Remote WebDriver.

    Goal:
    Create a narrow API for filling a logged-in website form. The website does not
    provide an API. The user will log in manually through noVNC. The app must never
    handle usernames, passwords, 2FA codes, or cookies directly.

    Stack:
    - Python 3.12
    - FastAPI
    - Selenium Python package
    - Remote WebDriver at SELENIUM_URL, default http://selenium:4444/wd/hub

    Endpoints:
    - GET /health
    - POST /portal/draft-request
    - GET /debug/screenshot

    Security requirements:
    - Require Authorization: Bearer $AUTOMATION_API_TOKEN on every endpoint except /health.
    - Do not accept arbitrary CSS selectors, URLs, or JavaScript from the API caller.
    - Keep selectors in a server-side SELECTORS dictionary.
    - If the browser is on a login page, return {"status": "needs_login"}.
    - Default dry_run to true. In dry_run mode, fill fields but do not click Save Draft.
    - Log actions, but never log secrets or full cookies.

    Draft request input:
    - subject: string, max 120 chars
    - details: string, max 4000 chars
    - priority: one of low, normal, high
    - dry_run: boolean, default true

    Selenium behavior:
    - Reuse one browser session while the app is alive.
    - Use WebDriverWait and expected_conditions, not blind sleeps.
    - Navigate to https://example-portal.invalid/requests/new.
    - Fill subject, details, and priority.
    - Click Save Draft only when dry_run is false.
    - Return status, current_url, page_title, and screenshot_file.

    Operational behavior:
    - Provide Dockerfile and docker-compose.yml.
    - In docker-compose.yml, bind wrapper API to 127.0.0.1:8088 for local testing.
    - Selenium and noVNC must not be exposed publicly.
    - Add clear comments only where intent is not obvious.

That prompt matters more than the framework. It tells the AI where the boundary is: no login handling, no arbitrary browser control, no public WebDriver.
Step 5: Minimal Wrapper Example
A simple generated wrapper will look like this. Create form-wrapper/main.py:
    import os
    from typing import Literal

    from fastapi import FastAPI, Header, HTTPException
    from pydantic import BaseModel, Field
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    SELENIUM_URL = os.getenv("SELENIUM_URL", "http://selenium:4444/wd/hub")
    API_TOKEN = os.environ["AUTOMATION_API_TOKEN"]
    PORTAL_DRAFT_URL = "https://example-portal.invalid/requests/new"

    # Selectors live on the server so the API caller can never choose them.
    SELECTORS = {
        "subject": "#request_subject",
        "details": "#request_details",
        "priority": "#request_priority",
        "save_draft": "button[data-action='save-draft']",
    }

    app = FastAPI()
    driver = None  # one shared browser session, created on first use


    class DraftRequest(BaseModel):
        subject: str = Field(max_length=120)
        details: str = Field(max_length=4000)
        priority: Literal["low", "normal", "high"] = "normal"
        dry_run: bool = True


    def require_token(authorization: str | None) -> None:
        if authorization != f"Bearer {API_TOKEN}":
            raise HTTPException(status_code=401, detail="Unauthorized")


    def browser():
        """Return the shared Remote WebDriver session, creating it if needed."""
        global driver
        if driver is None:
            options = webdriver.ChromeOptions()
            driver = webdriver.Remote(command_executor=SELENIUM_URL, options=options)
        return driver


    def on_login_page(current_url: str, title: str) -> bool:
        # Simple heuristic: adjust the keywords to match your site's login page.
        text = f"{current_url} {title}".lower()
        return "login" in text or "sign-in" in text or "signin" in text


    @app.get("/health")
    def health():
        return {"status": "ok"}


    @app.post("/portal/draft-request")
    def draft_request(payload: DraftRequest, authorization: str | None = Header(default=None)):
        require_token(authorization)
        page = browser()
        wait = WebDriverWait(page, 20)
        page.get(PORTAL_DRAFT_URL)

        # Stop early if the site redirected to login; a human handles that.
        if on_login_page(page.current_url, page.title):
            return {"status": "needs_login", "current_url": page.current_url}

        wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, SELECTORS["subject"]))).clear()
        page.find_element(By.CSS_SELECTOR, SELECTORS["subject"]).send_keys(payload.subject)
        page.find_element(By.CSS_SELECTOR, SELECTORS["details"]).clear()
        page.find_element(By.CSS_SELECTOR, SELECTORS["details"]).send_keys(payload.details)
        Select(page.find_element(By.CSS_SELECTOR, SELECTORS["priority"])).select_by_value(payload.priority)

        # Only a non-dry run is allowed to click anything.
        if not payload.dry_run:
            wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, SELECTORS["save_draft"]))).click()

        # The screenshot is written inside the wrapper container.
        screenshot_file = "/tmp/latest-screenshot.png"
        page.save_screenshot(screenshot_file)
        return {
            "status": "filled" if payload.dry_run else "draft_saved",
            "current_url": page.current_url,
            "page_title": page.title,
            "screenshot_file": screenshot_file,
        }


    @app.get("/debug/screenshot")
    def screenshot(authorization: str | None = Header(default=None)):
        require_token(authorization)
        page = browser()
        screenshot_file = "/tmp/latest-screenshot.png"
        page.save_screenshot(screenshot_file)
        return {"status": "ok", "screenshot_file": screenshot_file}

This is intentionally narrow. It cannot browse arbitrary URLs. It cannot run arbitrary JavaScript. It cannot fill arbitrary selectors chosen by the AI. You edit the server-side selectors after inspecting the target page.
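One gap worth knowing about: FastAPI runs plain `def` endpoints in a thread pool, so two requests arriving together could drive the single shared browser at the same time. A lock is one way to serialize them. The sketch below uses only the standard library; `exclusive_browser` is a hypothetical helper, and `get_driver` stands in for the wrapper's `browser()` function.

```python
import threading
from contextlib import contextmanager

_browser_lock = threading.Lock()


@contextmanager
def exclusive_browser(get_driver, timeout: float = 30.0):
    """Yield the shared driver while holding a lock, or raise if it is busy."""
    if not _browser_lock.acquire(timeout=timeout):
        raise TimeoutError("browser is busy with another request")
    try:
        yield get_driver()
    finally:
        # Always release, even if the Selenium steps raise.
        _browser_lock.release()
```

Inside an endpoint this would read `with exclusive_browser(browser) as page:` around the Selenium steps, with the `TimeoutError` mapped to an HTTP 503 or similar so the agent can retry later.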
Create form-wrapper/Dockerfile:

    FROM python:3.12-slim
    WORKDIR /app
    RUN pip install --no-cache-dir fastapi uvicorn selenium pydantic
    COPY main.py /app/main.py
    CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8088"]

Update docker-compose.yml:

    services:
      selenium:
        image: selenium/standalone-chrome:latest
        shm_size: "2g"
        ports:
          - "127.0.0.1:4444:4444"
          - "127.0.0.1:7900:7900"
        environment:
          SE_VNC_PASSWORD: "change-this-password"
      form-wrapper:
        build: ./form-wrapper
        depends_on:
          - selenium
        ports:
          - "127.0.0.1:8088:8088"
        environment:
          SELENIUM_URL: "http://selenium:4444/wd/hub"
          AUTOMATION_API_TOKEN: "${AUTOMATION_API_TOKEN}"

Create .env:

    AUTOMATION_API_TOKEN=replace-this-with-a-long-random-token

Restart with the wrapper:

    docker compose up -d --build

Step 6: Test It Without AI
Load the token into your terminal:

    set -a
    . ./.env
    set +a

Call the wrapper:

    curl -X POST http://127.0.0.1:8088/portal/draft-request \
      -H "Authorization: Bearer $AUTOMATION_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "subject": "Quarterly vendor check-in",
        "details": "Please prepare a draft request for the next vendor review.",
        "priority": "normal",
        "dry_run": true
      }'

If the response says needs_login, open noVNC and log in manually. Then run the same request again.
Keep dry_run enabled until you have watched the browser fill the correct fields. Only then should you allow the wrapper to click "Save Draft." For sensitive workflows, keep the final submit button outside automation and let a human review.
Step 7: Let The Agent Call The Wrapper
After the wrapper works by hand, your agentic workflow can call it like any other HTTP API. This could be an n8n workflow, a cron job, a small worker process, a queue consumer, or an agent framework that supports HTTP tools.
    You can create draft requests through this API:

    POST http://127.0.0.1:8088/portal/draft-request
    Authorization: Bearer $AUTOMATION_API_TOKEN
    Content-Type: application/json

    Rules:
    - Never ask for my website password.
    - Never ask for 2FA codes.
    - Never call Selenium or noVNC directly.
    - If the API returns needs_login, tell me to log in through noVNC.
    - Use dry_run=true unless I explicitly approve saving the draft.

The AI now handles language and structure. Your wrapper handles browser mechanics. The logged-in browser handles the website. Those boundaries make the workflow easier to reason about.
The important agent behavior is the stop condition. If the wrapper returns needs_login, the agent should not try to solve login by asking for credentials. It should notify you, pause the job, and continue only after you have logged in through noVNC.
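For a worker or custom tool that is not n8n, the same call can be made with Python's standard library. This is a sketch under assumptions: `build_draft_call`, `send`, and `must_pause` are hypothetical helper names, the URL is the local test address from this tutorial, and `dry_run` is hard-coded to true so that saving stays a separate, human-approved path.

```python
import json
import urllib.request

# Local test address from this tutorial; change it for a server deployment.
WRAPPER_URL = "http://127.0.0.1:8088/portal/draft-request"


def build_draft_call(token: str, subject: str, details: str,
                     priority: str = "normal") -> urllib.request.Request:
    """Build the POST request. dry_run is always true in this sketch."""
    body = json.dumps({
        "subject": subject,
        "details": details,
        "priority": priority,
        "dry_run": True,
    }).encode("utf-8")
    return urllib.request.Request(
        WRAPPER_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the wrapper's JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def must_pause(response: dict) -> bool:
    """True when the job should stop and wait for a human login."""
    return response.get("status") == "needs_login"
```

When `must_pause` returns true, the worker notifies you and leaves the job in the queue. It has no code path that could collect a password, which is the point.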
Step 8: Move It To A Server

Local is the safest first test. For 24/7 automation, move the same Docker Compose stack to a small VPS.
| Local setup | Server setup |
|---|---|
| Use 127.0.0.1:7900 directly | Use an SSH tunnel to reach noVNC |
| Wrapper API bound to localhost | Expose wrapper through HTTPS, VPN, or Tailscale |
| Your laptop must stay on | The VPS keeps the browser session alive |
On a remote server, do not expose noVNC publicly. Tunnel to it when you need to log in or debug:
    ssh -L 7900:127.0.0.1:7900 user@your-server

Then open this on your own machine:

    http://127.0.0.1:7900/?autoconnect=1&resize=scale&password=change-this-password

If your AI platform needs to call the wrapper from outside the server, expose only the wrapper API, protect it with HTTPS and a strong bearer token, and consider IP allowlists or a private network. Keep 4444 and 7900 private.
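An IP allowlist can be a few lines with Python's `ipaddress` module. The sketch below is illustrative only: `is_allowed` is a hypothetical helper, the addresses are documentation examples, and `100.64.0.0/10` is included on the assumption that callers arrive over a Tailscale-style private range. Behind a reverse proxy, the client address usually has to come from the proxy rather than the raw socket, so treat this as the comparison step, not the whole solution.

```python
import ipaddress


def is_allowed(client_ip: str, allowlist: list[str]) -> bool:
    """Return True if client_ip falls inside any allowlisted address or CIDR."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a valid IP, so never allowed
    # ip_network accepts both single addresses and CIDR ranges.
    return any(address in ipaddress.ip_network(entry) for entry in allowlist)


if __name__ == "__main__":
    allow = ["203.0.113.7", "100.64.0.0/10"]  # example values, not real hosts
    print(is_allowed("203.0.113.7", allow))
    print(is_allowed("198.51.100.9", allow))
```

The allowlist is an extra layer on top of the bearer token, not a replacement for it.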
Session Persistence
The beginner version keeps the browser session alive while the Selenium container stays running. That is often enough for a first workflow.
You can mount a Chrome profile volume later if you need login to survive container restarts, but test it carefully. Some websites expire cookies, require fresh 2FA, or invalidate sessions after IP changes. Chrome profile locks can also make restarts annoying if the browser exits uncleanly.
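A related detail: if the Selenium container restarts, the wrapper's cached `driver` still points at a session that no longer exists. Selenium Grid may also close sessions that sit idle past its session timeout (docker-selenium exposes this as `SE_NODE_SESSION_TIMEOUT`; check the README for the current default). One hedged way to cope is to probe the session and recreate it when the probe fails. `BrowserSession` is a hypothetical class, and `create_driver` stands in for the `webdriver.Remote(...)` call.

```python
class BrowserSession:
    """Cache one driver and replace it when the remote session is gone."""

    def __init__(self, create_driver):
        self._create_driver = create_driver  # e.g. lambda: webdriver.Remote(...)
        self._driver = None
        self.restarts = 0

    def get(self):
        if self._driver is not None:
            try:
                self._driver.title  # cheap round trip to the remote session
            except Exception:
                # Selenium raises WebDriverException subclasses here; this
                # sketch catches broadly to stay free of Selenium imports.
                self._driver = None
                self.restarts += 1
        if self._driver is None:
            self._driver = self._create_driver()
        return self._driver
```

A recreated session starts logged out, so the next request lands on the login page and the wrapper's existing check returns needs_login. Recovery therefore still ends with a human logging in, never with stored credentials.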
Start with the simpler rule:

    Keep the container running.
    Return needs_login when the session expires.
    Let the human re-authenticate through noVNC.

Guardrails Worth Keeping
| Risk | Guardrail |
|---|---|
| AI sees credentials | Human logs in through noVNC; wrapper never accepts passwords |
| AI clicks the wrong thing | Wrapper exposes allowlisted actions, not full browser control |
| Website layout changes | Use explicit selectors, screenshots, and needs_review responses |
| Automation submits bad data | Default to dry_run and save drafts before final submit |
| Public remote-control endpoint | Keep WebDriver and noVNC bound to localhost or private network |
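The needs_review response in the table is not part of the Step 5 wrapper, so here is one possible shape for it, offered as an assumption rather than the tutorial's method. The idea is to confirm that every expected selector still matches before touching the form. `missing_selectors` and `review_status` are hypothetical names, and `count_matches` is injected so the logic can be tested without a browser.

```python
def missing_selectors(count_matches, selectors: dict[str, str]) -> list[str]:
    """Return the names of selectors that match nothing on the current page."""
    return [name for name, css in selectors.items() if count_matches(css) == 0]


def review_status(count_matches, selectors: dict[str, str]) -> dict:
    """Report needs_review when the page layout no longer fits the selectors."""
    missing = missing_selectors(count_matches, selectors)
    if missing:
        return {"status": "needs_review", "missing": missing}
    return {"status": "ok", "missing": []}
```

With Selenium, `count_matches` could be `lambda css: len(page.find_elements(By.CSS_SELECTOR, css))`. Returning the missing names alongside a screenshot gives a human enough context to update the server-side selectors.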
When Not To Use This
Do not use this pattern when the website has a proper API. Use the API. It will be more stable, easier to monitor, and less likely to break when the page changes.
Also avoid this pattern when the site forbids automation, uses CAPTCHA as a clear anti-automation boundary, or handles actions where one bad click is expensive. In those cases, AI can still prepare the data, but a human should perform the browser action.
The Useful Boundary
The important idea is not Selenium. Selenium is replaceable. The useful idea is the boundary.
    AI should not own your login.
    AI should not own your whole browser.
    AI can own structured draft data.
    Your API should decide what browser actions are allowed.

That is a calmer way to automate websites without APIs. You still get AI-generated drafts, 24/7 server-side execution, and visible browser debugging. But you do not have to hand an AI agent your password or let it operate your personal computer.
License
Article text © 2026 Mark Huang. Licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) unless otherwise noted. Article text is licensed for non-commercial sharing with attribution to the original article URL. Commercial use requires prior written permission and must clearly cite the original source.
Code snippets, screenshots, third-party assets, and site source code may have separate terms.
Suggested attribution: Based on "Agentic Browser Automation Without Giving AI Your Login" by Mark Huang, originally published at https://markhuang.ai/blog/ai-browser-automation-without-sharing-login.