Browser API

The complete API to control the browser is handled by one class, called "browser".
The browser object is a global instance that is available in all your scripts and modules.
You don't need to import anything to get access to it.

Many of these method-calls will be generated automatically when you hit the record-button.

Quicklinks:
Loading pages - Tab management - Browser input - Timeouts - Extracting Info - Content Modification - Browser setup

The Browser class specifies the following methods:

browser.load_page(url)

Loads the given URL in the current browser tab and waits until loading has finished.
If the page loads a lot of content dynamically, it might not be completely loaded when
load_page returns. In that case, call time.sleep(seconds) afterwards to give the remaining content time to arrive. In many cases, however, a simple call to load_page() is enough.
Applies to: current tab
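
The advice above can be wrapped in a small helper. This is only a sketch: the name load_and_settle and the default pause of 2 seconds are made up for illustration, and the browser object is passed in as a parameter so the helper can also be tried outside AutoWWW.

```python
import time


def load_and_settle(browser, url, settle_seconds=2.0):
    """Load url in the current tab, then pause so late content can arrive.

    Hypothetical helper, not part of the AutoWWW API.
    Returns False if the load failed, True otherwise.
    """
    browser.load_page(url)        # blocks until the initial load has finished
    if browser.has_error():       # True if the last load_page() failed
        return False
    time.sleep(settle_seconds)    # crude grace period for dynamic content
    return True
```

Inside an AutoWWW script you would call it as load_and_settle(browser, 'http://example.com'). A fixed pause is a blunt tool; pick the shortest value that works for the page in question.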

browser.load_page_async(url)

Works like load_page, but returns immediately instead of waiting for the page to load.
Can be useful to start loading pages in multiple tabs in parallel. Might be used in conjunction with
browser.is_loading() and browser.wait_for_load().
Applies to: current tab

browser.back()

Go one page back. The same functionality that's known from browsers.
Applies to: current tab

browser.forward()

Go one page forward. Can be used if back() was called before. The same functionality that's known from browsers.
Applies to: current tab

browser.create_new_page()

Creates a new tab in the browser. Important: browser tabs in AutoWWW are largely independent of each other.
Most notably, tabs do NOT share cookies or similar state, so every tab acts like a
separate browser. For example, you can log into multiple accounts on the same site in different tabs.
Returns: A Tab-ID that can be used as argument for other functions such as select_page or remove_page.

browser.remove_page(tabId)

Removes the given tab. The tabId argument is the value that was returned by create_new_page before.
Returns: True if the tabId is valid, False otherwise.
Applies to: current tab

browser.select_page(tabId)

Switches tabs in the browser. The tabId argument is the value that was returned by create_new_page before.
If you want to switch to the first tab that existed when the browser was launched, use
browser.select_page(browser.default_page). browser.default_page cannot be closed.
Returns: True if the tabId is valid, False otherwise.

browser.clear_pages()

Closes all tabs but the first one.

browser.get_num_pages()

Returns: the number of currently opened tabs.

browser.get_selected_page()

Returns: the tabId of the currently opened tab.

browser.set_tab_title(id, title)

Sets the title of the tab with the given ID to title.
Applies to: provided tab id
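
The tab functions above can be combined to open one labelled tab per page. The following is a sketch under assumptions: open_titled_tabs is a hypothetical helper name, and the browser object is passed in as a parameter so the code can be tried outside AutoWWW. Since it is not documented whether create_new_page also selects the new tab, the sketch selects it explicitly.

```python
def open_titled_tabs(browser, pages):
    """Open one tab per entry in pages and return a dict title -> tabId.

    pages maps a tab title to the URL to load in that tab.
    Hypothetical helper, not part of the AutoWWW API.
    """
    tabs = {}
    for title, url in sorted(pages.items()):
        tab_id = browser.create_new_page()
        browser.set_tab_title(tab_id, title)
        browser.select_page(tab_id)      # make the new tab the current one
        browser.load_page(url)           # load_page acts on the current tab
        tabs[title] = tab_id
    # Switch back to the tab that existed when the browser was launched.
    browser.select_page(browser.default_page)
    return tabs
```

The returned tab IDs can later be passed to select_page or remove_page. Because tabs do not share cookies, each of these tabs can hold its own login session.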

browser.click_at(x, y)

Simulates a mouse click at pixel x, y of the page (the origin is in the top left corner).
Applies to: current tab

browser.click_element(element_id)

Simulates a click on a page element (such as a form button) identified by its ID.
For example, <input type="submit" id="sb"/> can be clicked with browser.click_element('sb').
Returns: True if an element with the given ID was found, False otherwise.
Applies to: current tab

browser.set_click_delay(seconds)

After a simulated click, the browser always waits a moment so that JavaScript animations and the like can finish before script execution continues. Set this to 0 if you need faster clicking.
Applies to: complete browser

browser.set_focus_to(element_id)

Similar to click_element, this function locates a page element (e.g. a form field) by ID and sets the input focus to it. Often used in conjunction with write_text.
Applies to: current tab

browser.write_text(text)

Simulates the user writing text into the browser window. To type into a form field, you might need to give it
input focus first, either with set_focus_to or by simulating a click on the field.
Applies to: current tab
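
A typical form submission chains set_focus_to, write_text and click_element. The helper below is a minimal sketch: fill_and_submit and the element IDs in the usage line are hypothetical, and the browser object is passed in as a parameter so the code can be tried outside AutoWWW.

```python
def fill_and_submit(browser, fields, submit_id):
    """Type text into several form fields, then click the submit element.

    fields is a list of (element_id, text) pairs, typed in the given order.
    Hypothetical helper, not part of the AutoWWW API.
    Returns the result of click_element (True if the element was found).
    """
    for element_id, text in fields:
        browser.set_focus_to(element_id)   # give the field input focus
        browser.write_text(text)           # type as if the user did
    return browser.click_element(submit_id)
```

In a script this could look like fill_and_submit(browser, [('user', 'alice'), ('pass', 'secret')], 'sb'), assuming the page uses those IDs.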

browser.press_key(keyId, keyModifiers, keyText)

Simulates a key press in the browser. It is highly recommended to use write_text instead whenever possible
because it's much easier to use.
Applies to: current tab

browser.set_load_timeout(seconds)

Sets the time, in seconds, that load_page() waits for a page to load before it gives up.
This defaults to 30 seconds.
Applies to: complete browser

browser.get_load_timeout()

Returns: the timeout specified by set_load_timeout. Defaults to 30 seconds.
Applies to: complete browser

browser.has_error()

Returns: True if the last load_page() operation failed, False otherwise.
Applies to: complete browser
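
The timeout and error functions can be combined into a simple retry loop. This is a sketch under assumptions: load_with_retries is a hypothetical helper, set_load_timeout is assumed to take the number of seconds as its only argument, and the browser object is passed in as a parameter so the code can be tried outside AutoWWW.

```python
def load_with_retries(browser, url, attempts=3, timeout_seconds=10):
    """Try to load url up to `attempts` times with a temporary load timeout.

    Hypothetical helper, not part of the AutoWWW API.
    Returns True as soon as one attempt succeeds, False if all of them fail.
    """
    previous = browser.get_load_timeout()
    # Assumption: set_load_timeout accepts the timeout in seconds.
    browser.set_load_timeout(timeout_seconds)
    try:
        for _ in range(attempts):
            browser.load_page(url)
            if not browser.has_error():
                return True
        return False
    finally:
        # The timeout applies to the complete browser, so restore it.
        browser.set_load_timeout(previous)
```

Restoring the old value matters because the load timeout is a browser-wide setting and would otherwise leak into the rest of the script.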

browser.is_loading()

Returns: True if the page is currently loading contents. This function can be useful in conjunction with
load_page_async. After a call to load_page_async, is_loading() will return True until the page has finished loading.
Applies to: current tab

browser.wait_for_load()

Waits until is_loading() returns False. Returns as soon as the page is completely loaded.
Applies to: current tab
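
Together with load_page_async and the tab functions, wait_for_load makes it possible to load several pages in parallel. A minimal sketch, assuming a hypothetical helper name load_in_parallel and a browser object passed in as a parameter so the code can be tried outside AutoWWW:

```python
def load_in_parallel(browser, urls):
    """Start loading every URL in its own tab, then wait for all of them.

    Hypothetical helper, not part of the AutoWWW API.
    Returns the list of tab IDs, in the same order as urls.
    """
    tab_ids = []
    for url in urls:
        tab_id = browser.create_new_page()
        browser.select_page(tab_id)       # load_page_async acts on the current tab
        browser.load_page_async(url)      # returns immediately
        tab_ids.append(tab_id)
    for tab_id in tab_ids:
        browser.select_page(tab_id)       # wait_for_load acts on the current tab
        browser.wait_for_load()           # blocks until this tab has finished
    return tab_ids
```

All loads are started before the first wait, so the total time is roughly that of the slowest page rather than the sum of all pages.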

browser.get_page_html()

Returns: the HTML code of the current page.
Applies to: current tab

browser.get_page_text()

Returns: the text that's currently readable on the screen, in an unspecified format.
Use 'print browser.get_page_text()' to see what this looks like.
NOTE: This function only returns the text of the frame that currently has focus.
On a page with multiple frames, you might want to click into the correct frame first to focus it.
Applies to: current tab

browser.page_contains(text)

Returns: True if the given text is contained in the page's currently focused frame. Click into a different frame first to change focus if required. Equivalent to the expression text in browser.get_page_text().
Applies to: current tab
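
For pages that fill in their content after loading, polling page_contains is often more reliable than a fixed sleep. The sketch below assumes a hypothetical helper name wait_for_text and takes the browser object as a parameter so it can be tried outside AutoWWW.

```python
import time


def wait_for_text(browser, text, timeout_seconds=10.0, poll_seconds=0.5):
    """Poll the current tab until text shows up or the timeout expires.

    Hypothetical helper, not part of the AutoWWW API.
    Returns True if the text was found, False on timeout.
    """
    deadline = time.time() + timeout_seconds
    while True:
        if browser.page_contains(text):   # checks the currently focused frame
            return True
        if time.time() >= deadline:
            return False
        time.sleep(poll_seconds)
```

A call such as wait_for_text(browser, 'Welcome') after submitting a login form would then continue as soon as the text appears. The page is always checked at least once, even with a timeout of 0.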

browser.save_image(src, path)

Saves an image from the web page to 'path' on your file system.
The parameter 'src' has to be the image path as written in <img src="..."> on the HTML page.
It can also be just a part of this path: the first image in the HTML document whose src contains the text given in 'src'
will be saved.
Example: If the image you want appears in the HTML code as <img src="http://example.com/img.png?dummy=foo">,
you can call browser.save_image('img.png', 'c:/foo.png') or browser.save_image('img.png?dummy=foo', 'c:/foo.png')
or browser.save_image('http://example.com/img.png?dummy=foo', 'c:/foo.png'), and so on.
Applies to: current tab

browser.save_as_image(x, y, path)

Saves the web element at position x/y as an image to 'path' on your file system.
Applies to: current tab

browser.load_images(boolean)

Pass False as an argument to force the browser to not load images on web pages. Pass True to enable image loading again.
This can help to decrease traffic usage dramatically, especially when multiple AutoWWW instances are running constantly to do some work.
Applies to: complete browser

browser.mute_sound()

Mutes all HTML5 <audio> elements on the current page.
Applies to: the current page. After navigating to a different page, sound is no longer muted.

browser.execute_javascript(jsCode)

Executes the JavaScript code given by jsCode in the context of the current page. This can be used to fill in forms, hide page elements, inject content and much more.
Applies to: current tab
Returns: The result of the last executed JavaScript expression. e.g. if you call browser.execute_javascript('var i = 3; i;'), the function would return 3.
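
When the JavaScript code contains values that come from your script, quoting them by hand is error-prone. One possible approach, shown here as a sketch, is to let json.dumps produce the string literals. The helper name js_set_value is hypothetical and not part of the AutoWWW API.

```python
import json


def js_set_value(element_id, value):
    """Build JavaScript that sets the value of the element with the given ID.

    json.dumps turns both strings into quoted, escaped literals, so quotes
    and backslashes in the input cannot break out of the generated code.
    """
    return "document.getElementById(%s).value = %s;" % (
        json.dumps(element_id),
        json.dumps(value),
    )
```

The result can be passed straight on, for example browser.execute_javascript(js_set_value('q', 'hello')), which runs document.getElementById("q").value = "hello"; in the current page.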

browser.resize(width, height)

Resizes the center part of the browser to the given width/height in pixels. The window itself is not resized.
Applies to: complete browser

browser.scroll_to(x, y)

Scrolls to the given position on the page.
Applies to: current tab

browser.set_proxy(hostname, port, username, password)

Makes the currently selected page use the given SOCKS5 proxy server. username and password may be empty.
Applies to: current tab

browser.set_status_text(text)

Writes the given text to the status-bar of the browser.

browser.show_messagebox(text)

Creates a message box with the given text in the browser.

browser.clear_cookies()

Clears all browser cookies of the current tab.
Applies to: current tab

browser.set_javascript_enabled(enabled)

Enables/disables JavaScript (pass True or False)
Applies to: all tabs

browser.set_blacklist(list)

Pass a list of strings containing the regexes to be blocked.
Applies to: current tab

browser.set_whitelist(list)

Pass a list of strings containing regexes that act as exceptions to the ones provided to set_blacklist.
Applies to: current tab

browser.set_request_overrides(overrides)

Pass a dict that maps regexes to URLs. Requests matching a regex are redirected to the corresponding URL.
Applies to: current tab
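
The three filter functions can be set up in one go. The sketch below makes several assumptions: apply_filters is a hypothetical helper, set_blacklist is assumed to take the list as its only argument, and the patterns are assumed to also be valid for Python's re module, which is used here only to catch typos early. AutoWWW itself may use a different regex dialect. The browser object is passed in as a parameter so the code can be tried outside AutoWWW.

```python
import re


def apply_filters(browser, blacklist, whitelist=(), overrides=None):
    """Check all patterns, then install blacklist, whitelist and overrides.

    Hypothetical helper, not part of the AutoWWW API.
    Raises re.error before touching the browser if a pattern is malformed.
    """
    overrides = dict(overrides or {})
    for pattern in list(blacklist) + list(whitelist) + list(overrides):
        re.compile(pattern)               # sanity check only, result is unused
    browser.set_blacklist(list(blacklist))
    browser.set_whitelist(list(whitelist))
    browser.set_request_overrides(overrides)
```

All three settings apply to the current tab, so select the right tab before calling the helper.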

browser.clear_caches()

Clears the in-memory cache.
Applies to: all tabs

browser.set_plugins_enabled(enabled)

If set to False, all plugins will be disabled (Flash/Java/...)
Applies to: all tabs

browser.set_allow_popups(allow)

If set to False, popups will be blocked. If set to True, they will be opened in the current tab (no actual popup window).
Applies to: all tabs

browser.set_user_agent(useragent)

Sets the user agent string
Applies to: current tab

browser.clear_console()

Clears the current script-output console