June 27, 2026
Selenium Page Object Model Tutorial
Learn Selenium Page Object Model with practical Python and Java examples, locator strategies, wait handling, and maintainability tips for SDETs and QA engineers.
If you have maintained a Selenium suite for more than a few months, you already know the usual pain points: duplicated locators, brittle selectors, and tests that break because a button moved to a different component. The Page Object Model is the pattern most teams reach for to tame that complexity. It is not magic, but when used well, it gives you a cleaner boundary between test intent and UI mechanics.
This Selenium Page Object Model tutorial walks through the pattern from a practical SDET point of view. I will show why it helps, what it does not solve, and how to implement it in both Python and Java. I will also cover common mistakes, where to place waits, and when a different approach, including a no-code alternative like Endtest, an agentic AI [Test automation](https://en.wikipedia.org/wiki/Test_automation) platform, can be a better fit for some teams.
What the Page Object Model actually is
Page Object Model, often shortened to POM, is a design pattern for test automation where each page, screen, or meaningful UI surface is represented by a class. That class stores locators and exposes behaviors, while the test itself focuses on business flow.
At a high level:
- Tests say what the user is trying to do.
- Page objects know how to do it in the UI.
- Locators and wait logic live in one place instead of being scattered across many tests.
The real goal of Selenium POM is not abstraction for its own sake; it is reducing the cost of UI change.
A good page object should answer questions like:
- How do I enter credentials on the login page?
- How do I submit the form?
- How do I detect that the page is ready?
A bad page object becomes a dumping ground for every utility method in the codebase. That is when the pattern starts to look heavier than the problem it solves.
Why Selenium suites become hard to maintain without POM
Directly scripting Selenium actions in tests is fine when you are exploring a product or writing a tiny proof of concept. It becomes painful when the suite grows.
Consider this style of test:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Chrome()
browser.get("https://example.com/login")

browser.find_element(By.ID, "email").send_keys("qa@example.com")
browser.find_element(By.ID, "password").send_keys("secret")
browser.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

WebDriverWait(browser, 10).until(
    EC.visibility_of_element_located((By.ID, "dashboard"))
)
This is readable enough for one test. Now imagine 80 tests with the same login sequence, and the login form changes. You will update the same locators everywhere, and if a wait is slightly off, debugging becomes tedious.
POM helps because the login flow becomes a reusable object, and the test files stay focused on behavior.
What belongs in a page object, and what does not
A page object should usually contain:
- Locators for elements on that page or component
- Interaction methods, such as login(), search(), add_to_cart()
- Assertions that are tightly tied to page state, sometimes
- A page-ready check, such as waiting for a key element
A page object should usually not contain:
- Test data setup
- Cross-page business flows that belong in the test or a workflow layer
- Assertions for unrelated application behavior
- Environment-specific configuration
- Low-level driver creation
One practical rule I use is this:
If a method reads like a user action on a specific page, it likely belongs in the page object. If it reads like a business scenario spanning multiple screens, it probably belongs in the test or a flow class.
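To make that rule concrete, here is a minimal sketch of a flow class sitting on top of two page objects. Everything in it is hypothetical: FakeDriver is a stand-in that only records actions so the example runs without a browser, and CartPage, PurchaseFlow, and the SKU value are names I made up for illustration.

```python
class FakeDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.actions = []


class LoginPage:
    """Page object: user actions on the login page only."""

    def __init__(self, driver):
        self.driver = driver

    def login(self, email, password):
        # A real implementation would type into fields and click submit.
        self.driver.actions.append(f"login as {email}")


class CartPage:
    """Page object: user actions on the cart page only."""

    def __init__(self, driver):
        self.driver = driver

    def add_item(self, sku):
        self.driver.actions.append(f"add {sku} to cart")

    def checkout(self):
        self.driver.actions.append("checkout")


class PurchaseFlow:
    """Flow class: a business scenario that spans several pages."""

    def __init__(self, driver):
        self.login_page = LoginPage(driver)
        self.cart_page = CartPage(driver)

    def buy_as(self, email, password, sku):
        self.login_page.login(email, password)
        self.cart_page.add_item(sku)
        self.cart_page.checkout()


driver = FakeDriver()
PurchaseFlow(driver).buy_as("qa@example.com", "secret", "SKU-1")
print(driver.actions)
```

Each page object still reads like a set of actions on one page, while the multi-screen scenario lives in one place that tests can reuse.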
Selenium POM example in Python
Here is a small but realistic Python implementation using Selenium. I am using a base page for shared wait helpers, a login page object, and a test that reads like a user flow.
Base page
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
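class BasePage:
    def __init__(self, driver):
        self.driver = driver
        # One shared explicit wait with a 10-second timeout.
        self.wait = WebDriverWait(driver, 10)

    def click(self, locator):
        # Wait until the element is clickable, then click it.
        self.wait.until(EC.element_to_be_clickable(locator)).click()

    def type(self, locator, text):
        # Wait until the field is visible, then replace its contents.
        element = self.wait.until(EC.visibility_of_element_located(locator))
        element.clear()
        element.send_keys(text)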
Login page
from selenium.webdriver.common.by import By
from base_page import BasePage
class LoginPage(BasePage):
    EMAIL = (By.ID, "email")
    PASSWORD = (By.ID, "password")
    SUBMIT = (By.CSS_SELECTOR, "button[type='submit']")
    DASHBOARD = (By.ID, "dashboard")

    def open(self, url):
        self.driver.get(url)
        return self

    def login(self, email, password):
        self.type(self.EMAIL, email)
        self.type(self.PASSWORD, password)
        self.click(self.SUBMIT)
        return self

    def is_logged_in(self):
        # Raises TimeoutException if the dashboard never becomes visible.
        return self.wait.until(lambda d: d.find_element(*self.DASHBOARD).is_displayed())
Test
from selenium import webdriver
from login_page import LoginPage
def test_user_can_log_in():
    driver = webdriver.Chrome()
    try:
        page = LoginPage(driver)
        page.open("https://example.com/login").login("qa@example.com", "secret")
        assert page.is_logged_in()
    finally:
        driver.quit()
This is already a meaningful improvement over repeating raw Selenium calls in every test. The test itself now tells a story: open the login page, log in, assert success.
Selenium Page Object Model in Java
Java teams often use the same pattern, sometimes with PageFactory, sometimes with explicit By locators. I prefer explicit locators for most modern suites because they are easier to debug and less magical.
Base page
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import java.time.Duration;
public class BasePage {
    protected WebDriver driver;
    protected WebDriverWait wait;

    public BasePage(WebDriver driver) {
        this.driver = driver;
        this.wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    protected void click(By locator) {
        wait.until(ExpectedConditions.elementToBeClickable(locator)).click();
    }

    protected void type(By locator, String text) {
        WebElement element = wait.until(ExpectedConditions.visibilityOfElementLocated(locator));
        element.clear();
        element.sendKeys(text);
    }
}
Login page
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
public class LoginPage extends BasePage {
    private final By email = By.id("email");
    private final By password = By.id("password");
    private final By submit = By.cssSelector("button[type='submit']");
    private final By dashboard = By.id("dashboard");

    public LoginPage(WebDriver driver) {
        super(driver);
    }

    public LoginPage open(String url) {
        driver.get(url);
        return this;
    }

    public LoginPage login(String userEmail, String userPassword) {
        type(email, userEmail);
        type(password, userPassword);
        click(submit);
        return this;
    }

    public boolean isLoggedIn() {
        return wait.until(d -> d.findElement(dashboard).isDisplayed());
    }
}
Test
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class LoginTest {
    @Test
    void userCanLogIn() {
        WebDriver driver = new ChromeDriver();
        try {
            LoginPage page = new LoginPage(driver);
            page.open("https://example.com/login").login("qa@example.com", "secret");
            assertTrue(page.isLoggedIn());
        } finally {
            driver.quit();
        }
    }
}
The locator strategy matters more than the pattern name
A lot of teams adopt Selenium POM and still end up with flaky tests because the locators are poor. The pattern does not fix bad selector strategy.
Prefer stable locators in this order when possible:
- data-testid or similar test-only attributes
- Unique IDs
- Accessible roles and labels when supported by your framework and app structure
- Well-scoped CSS selectors
- XPath only when necessary
If your page object relies on long absolute XPath expressions, you are not buying much maintainability. You are just centralizing fragility.
A small example of a better locator choice:
LOGIN_BUTTON = (By.CSS_SELECTOR, "[data-testid='login-submit']")
This usually ages better than a selector tied to layout structure or a translated label.
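If most of your locators follow the test-attribute convention, a tiny helper can keep them consistent. The sketch below is an assumption on my part, not part of Selenium: by_test_id is a hypothetical function, and it returns a plain tuple using the string "css selector", which is the value behind By.CSS_SELECTOR, so it runs without Selenium installed.

```python
def by_test_id(value, attribute="data-testid"):
    """Return a (strategy, selector) locator tuple for a test-only attribute."""
    if '"' in value:
        # Keep the sketch simple: no escaping logic, just reject odd values.
        raise ValueError("test id must not contain double quotes")
    # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
    return ("css selector", f'[{attribute}="{value}"]')


LOGIN_BUTTON = by_test_id("login-submit")
print(LOGIN_BUTTON)
```

The resulting tuple can be unpacked into find_element or passed to an expected condition in the same way as the class-level locators shown earlier.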
Where waits should live in a POM design
Waits are one of the most important topics in Selenium architecture. If you scatter sleep() calls through tests, you are building flakiness into the suite. If you hide all waits deep inside utilities with no clear contract, debugging becomes difficult.
My default approach is:
- Use explicit waits in page objects for element readiness
- Keep tests free of raw timing logic
- Avoid implicit waits unless your team has a very specific reason; mixing them with explicit waits can produce unpredictable wait times
The page object should know how long it takes for its important state to appear. For example, a dashboard page object might wait for a key widget or heading before returning control to the test.
This is especially helpful in CI/CD pipelines, where test timing can change due to environment load. For the details of how explicit and implicit waits behave, the Selenium documentation is the right place to check.
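Here is a minimal sketch of a page-ready check owned by the page object. To keep it runnable without a browser, I am assuming a small polling helper, wait_until, as a stand-in for WebDriverWait, plus a hypothetical FakeDashboardDriver whose is_visible method is not a real WebDriver call. The summary-widget locator is also invented.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.1):
    """Poll condition() until it returns a truthy value or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


class FakeDashboardDriver:
    """Stand-in driver: the widget becomes visible on the third check."""

    def __init__(self):
        self.checks = 0

    def is_visible(self, locator):
        self.checks += 1
        return self.checks >= 3


class DashboardPage:
    SUMMARY_WIDGET = ("id", "summary-widget")  # hypothetical locator

    def __init__(self, driver, timeout=2.0):
        self.driver = driver
        self.timeout = timeout

    def wait_until_loaded(self):
        # The page object owns its readiness contract; tests never sleep.
        wait_until(
            lambda: self.driver.is_visible(self.SUMMARY_WIDGET),
            timeout=self.timeout,
            poll=0.01,
        )
        return self


driver = FakeDashboardDriver()
page = DashboardPage(driver).wait_until_loaded()
```

In a real suite the body of wait_until_loaded would more likely call WebDriverWait with an expected condition, but the shape stays the same: the test asks the page to become ready and never handles timing itself.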
Modeling pages versus components
For older applications with simple page structures, a one-class-per-page approach works fine. For modern apps with repeated UI patterns, component objects can be more useful than page-only classes.
Examples of reusable components:
- Navigation bars
- Modal dialogs
- Date pickers
- Tables and filters
- Toast notifications
A page can compose components instead of owning everything itself.
For example, a checkout page may own the form fields, but delegate the address selector to an AddressModal component. This keeps page objects smaller and avoids copy-paste when the same component appears in multiple places.
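A rough sketch of that composition follows. FakeDriver is a stand-in that records actions so the example runs without a browser, and the method names (fill_card, ship_to, choose) are hypothetical.

```python
class FakeDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.actions = []


class AddressModal:
    """Component object: reusable wherever the address modal appears."""

    def __init__(self, driver):
        self.driver = driver

    def choose(self, label):
        self.driver.actions.append(f"select address '{label}'")


class CheckoutPage:
    """Page object that composes a component instead of owning its details."""

    def __init__(self, driver):
        self.driver = driver
        self.address_modal = AddressModal(driver)

    def fill_card(self, number):
        self.driver.actions.append(f"type card ending {number[-4:]}")
        return self

    def ship_to(self, label):
        # The page opens the modal, then delegates the selection to it.
        self.driver.actions.append("open address modal")
        self.address_modal.choose(label)
        return self


driver = FakeDriver()
CheckoutPage(driver).fill_card("4242424242424242").ship_to("Home")
print(driver.actions)
```

A profile page that shows the same modal could hold its own AddressModal instance without copying any of the selection logic.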
Common Page Object Model mistakes
Here are mistakes I see often when reviewing Selenium codebases.
1. Putting assertions everywhere
A page object can expose state checks, but if it becomes an assertion-heavy mini test suite, you lose clarity. Let tests own the scenario-level assertions.
2. Returning raw WebElements
If the test code starts manipulating WebElement objects directly, the abstraction is leaking. Prefer methods such as submit() or select_country("US") instead of returning element handles all the time.
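As an illustration, the sketch below keeps the element handle inside the page object and exposes only an action and a plain value. FakeElement and FakeDriver are stand-ins I am assuming so the code runs without a browser, and the country locator is hypothetical. A real page would probably use Selenium's Select helper for a dropdown instead of send_keys.

```python
class FakeElement:
    """Stand-in for a WebElement backing a form field."""

    def __init__(self):
        self.value = ""

    def send_keys(self, text):
        self.value += text

    def get_attribute(self, name):
        return self.value if name == "value" else None


class FakeDriver:
    """Stand-in driver exposing find_element(by, value) like WebDriver."""

    def __init__(self):
        self._elements = {("id", "country"): FakeElement()}

    def find_element(self, by, value):
        return self._elements[(by, value)]


class SignupPage:
    COUNTRY = ("id", "country")  # hypothetical locator

    def __init__(self, driver):
        self.driver = driver

    def select_country(self, code):
        # The element handle stays inside the page object.
        self.driver.find_element(*self.COUNTRY).send_keys(code)
        return self

    def selected_country(self):
        # Expose state as a plain value, not as a WebElement.
        return self.driver.find_element(*self.COUNTRY).get_attribute("value")


page = SignupPage(FakeDriver()).select_country("US")
print(page.selected_country())
```

The test never sees an element, so a change from a text field to a dropdown would only touch SignupPage.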
3. Creating a page object for every HTML file
A page object should match a meaningful unit of interaction. One class per route is not always useful, especially in single-page applications where the visible state changes without a full navigation.
4. Sharing too much state through inheritance
A small base page is fine. A giant inheritance tree usually becomes hard to reason about. Prefer composition when a helper is not truly shared behavior.
5. Mixing framework setup with page logic
Browser creation, grid configuration, screenshots, and test data setup belong elsewhere. Keep page classes focused on UI behavior.
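One way to keep that separation is to create the driver in a small helper and hand it to page objects ready-made. This is a sketch under assumptions: managed_driver is a hypothetical helper, and FakeDriver stands in for a real browser so the example runs anywhere.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a WebDriver; only tracks whether quit() was called."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


class LoginPage:
    def __init__(self, driver):
        # The page receives a ready driver and never creates one itself.
        self.driver = driver


with managed_driver(FakeDriver) as driver:
    page = LoginPage(driver)
    was_open_during_test = not driver.closed
```

In a real suite the factory could be webdriver.Chrome, and the same create, yield, quit shape maps naturally onto a pytest fixture.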
How POM helps with flaky tests
POM does not eliminate flakiness by itself, but it makes flakiness easier to diagnose and reduce.
Why?
- Locator changes are centralized
- Wait policies are centralized
- Page readiness checks are reusable
- Tests have fewer moving parts
This means if a login page starts failing, you can inspect one LoginPage class instead of thirty tests. That alone saves a lot of time.
In practice, a maintainable Selenium suite is less about clever abstractions and more about reducing the number of places where timing and selectors can go wrong.
POM in CI/CD pipelines
Once your Selenium suite runs in CI, page object quality becomes even more important. Headless browser differences, slower environments, and parallel execution all expose weak abstractions.
A simple GitHub Actions example might look like this:
name: ui-tests

on:
  pull_request:
  push:
    branches: [main]

jobs:
  selenium-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest tests/ui
This is not a full production setup, but it highlights the practical point: if your page objects are brittle, CI will expose it quickly. Good POM structure makes failures more readable and updates less painful.
Page Object Model Python versus Page Object Model Java
Both languages support the pattern well, but they tend to feel different in practice.
Python strengths
- Less boilerplate
- Fast to prototype
- Easier for mixed QA and developer teams to read
- Works well with pytest fixtures
Java strengths
- Stronger typing
- Large enterprise ecosystem
- Clear fit for teams already standardized on JUnit, Maven, or Gradle
- Useful when you want stricter compile-time structure
The tradeoff is not about which language is objectively better. It is about what your team can maintain consistently. A small, disciplined Python suite will usually serve you better than a large, inconsistent Java suite, and the reverse holds just as well.
When Selenium POM is a good fit
Selenium POM is a strong choice when:
- You need browser automation across multiple browsers
- Your team is already invested in Selenium
- You have many stable UI workflows to cover
- You want code-level control over waits, retries, and integrations
- Developers and SDETs are comfortable maintaining test code
It is less compelling when:
- The team is small and nobody wants to own framework code
- Non-technical stakeholders need to review or update tests
- You spend more time maintaining the framework than adding coverage
- The app changes often and page classes constantly churn
In those cases, a simpler workflow may be better. For example, Endtest’s no-code testing approach lets teams build tests without maintaining page object classes, drivers, or framework setup. That can be a practical alternative when the bottleneck is framework maintenance rather than test design. Endtest also provides a migration path from Selenium for teams looking to reduce that code burden.
A practical decision framework
When deciding whether to use Selenium POM, ask these questions:
- Do we have enough automation skill on the team to own the framework?
- Are our UI tests stable enough to justify code-based maintenance?
- Do we need custom logic, complex assertions, or integrations that low-code tools may not cover as cleanly?
- Are we optimizing for developer control, or for team-wide accessibility?
If the answer is mostly about control and flexibility, Selenium POM is a good fit. If the answer is mostly about speed of authoring and reducing maintenance overhead, a no-code platform can make more sense.
A few implementation tips that save time later
Here are the habits that make the biggest difference in real projects.
- Keep page object methods short and intention-revealing
- Group locators near the methods that use them
- Name classes after user-facing surfaces, not internal components unless they are reusable
- Use a shared base class only for true cross-cutting helpers
- Return the next page object after navigation when it improves flow readability
- Add explicit wait helpers for known asynchronous states
A small example of page chaining in Python:
class LoginPage(BasePage):
    def login(self, email, password):
        self.type(self.EMAIL, email)
        self.type(self.PASSWORD, password)
        self.click(self.SUBMIT)
        return DashboardPage(self.driver)
That style works well when navigation is deterministic, but do not overuse it. Returning the next page object is useful only when the next state is truly expected and stable.
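When the next state depends on the input, one option is to give each expected outcome its own method. The sketch below assumes a FakeDriver with an invented submit_login method that fakes a credential check, so it runs without a browser. The name login_expecting_error is my own, not a convention from Selenium.

```python
class FakeDriver:
    """Stand-in driver that fakes a server-side credential check."""

    def __init__(self):
        self.error = None

    def submit_login(self, email, password):
        self.error = None if password == "secret" else "Invalid credentials"


class DashboardPage:
    def __init__(self, driver):
        self.driver = driver


class LoginPage:
    def __init__(self, driver):
        self.driver = driver

    def login(self, email, password):
        # Happy path: navigation is expected, so return the next page.
        self.driver.submit_login(email, password)
        return DashboardPage(self.driver)

    def login_expecting_error(self, email, password):
        # Failure path: the user stays here, so return the same page.
        self.driver.submit_login(email, password)
        return self

    def error_message(self):
        return self.driver.error


driver = FakeDriver()
dashboard = LoginPage(driver).login("qa@example.com", "secret")
login_page = LoginPage(driver).login_expecting_error("qa@example.com", "wrong")
print(login_page.error_message())
```

The test picks the method that matches the scenario it is describing, so the return type never has to guess which page comes next.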
Final thoughts
The Page Object Model is one of the most useful ideas in Selenium because it gives structure to a problem that quickly becomes messy: UI test maintainability. It works best when you use it to centralize locators, waits, and page-specific actions, not when you turn it into a giant abstraction layer.
If you are building a Selenium suite in Python or Java, start small, keep the boundary clear, and optimize for readable tests. If your team later decides the framework overhead is too high, it is also reasonable to evaluate a simpler path, including no-code tools that remove page object maintenance altogether.
Selenium POM is not the only way to build sustainable UI automation, but it is still a solid default when your team wants code-level control and is ready to own it.