Browser API
The complete API for controlling the browser is provided by a single class, called "browser".
A global instance of this class, also named browser, is available in all your scripts and modules.
You don't need to import anything to get access to it.
Many of these method-calls will be generated automatically when you hit the record-button.
Quicklinks:
Loading pages - Tab management - Browser input - Timeouts - Extracting Info - Content Modification - Browser setup
The Browser class specifies the following methods:
browser.load_page(url)
Loads the given URL in the current browser tab and waits until loading has finished.
If the page loads a lot of content dynamically, it might not be completely loaded when
load_page returns. In that case, call time.sleep(seconds) afterwards to give the remaining content time to load. In many cases, however, a simple call to load_page() is enough.
Applies to: current tab
browser.load_page_async(url)
The same as load_page, but returns immediately instead of waiting for the page to load.
This is useful for loading pages in multiple tabs in parallel. Can be used in conjunction with
browser.is_loading() and browser.wait_for_load().
Applies to: current tab
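The parallel loading mentioned above could be sketched roughly as follows. The helper name load_in_parallel is hypothetical, the browser object is passed in as a parameter so the snippet also runs outside AutoWWW, and it assumes that create_new_page() takes no arguments and does not necessarily select the new tab.

```python
def load_in_parallel(browser, urls):
    """Start loading each URL in its own tab, then wait for all of them.

    Returns the list of tab IDs, in the same order as `urls`.
    """
    tab_ids = []
    for url in urls:
        tab_id = browser.create_new_page()  # assumption: takes no arguments
        browser.select_page(tab_id)         # load_page_async applies to the current tab
        browser.load_page_async(url)        # returns immediately
        tab_ids.append(tab_id)

    for tab_id in tab_ids:
        browser.select_page(tab_id)
        browser.wait_for_load()             # blocks until this tab has finished loading
    return tab_ids
```

Inside an AutoWWW script, the global instance would simply be passed in, e.g. load_in_parallel(browser, ['http://example.com/a', 'http://example.com/b']).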
browser.back()
Goes one page back, like the back button in a regular browser.
Applies to: current tab
browser.forward()
Goes one page forward. Can be used after back() was called. Works like the forward button in a regular browser.
Applies to: current tab
browser.create_new_page()
Creates a new tab in the browser. Important: browser tabs in AutoWWW are more or less independent.
Most notably, browser tabs do NOT share cookies or similar session data. This means that every tab acts as a
separate browser, so you can, for example, log into multiple accounts on the same page in different tabs.
Returns: A Tab-ID that can be used as argument for other functions such as select_page or remove_page.
browser.remove_page(tabId)
Removes the given tab. The tabId argument is the value previously returned by create_new_page.
Returns: True if the tabId is valid, False otherwise.
Applies to: current tab
browser.select_page(tabId)
Switches tabs in the browser. The tabId argument is the value previously returned by create_new_page.
If you want to switch to the first tab that existed when the browser was launched, use
browser.select_page(browser.default_page). browser.default_page cannot be closed.
Returns: True if the tabId is valid, False otherwise.
browser.clear_pages()
Closes all tabs but the first one.
browser.get_num_pages()
Returns: the number of currently opened tabs.
browser.get_selected_page()
Returns: the tabId of the currently selected tab.
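Because most calls apply to the current tab, a script that works with several tabs often needs to switch to a tab, do something, and switch back. A minimal sketch of that pattern, using only select_page and get_selected_page, is shown here. The helper name run_in_tab and the callback convention are hypothetical, and the browser object is passed in explicitly.

```python
def run_in_tab(browser, tab_id, action):
    """Select `tab_id`, call `action(browser)`, then reselect the previous tab."""
    previous = browser.get_selected_page()
    if not browser.select_page(tab_id):      # select_page returns False for an invalid ID
        raise ValueError("unknown tab id: %r" % (tab_id,))
    try:
        return action(browser)
    finally:
        browser.select_page(previous)        # restore the tab that was selected before
```

For example, run_in_tab(browser, tab_id, lambda b: b.get_page_text()) would read the text of another tab without changing which tab the rest of the script sees as current.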
browser.set_tab_title(id, title)
Sets the title of the tab with the given ID to title.
Applies to: provided tab id
browser.click(x, y)
Simulates a mouse click at pixel x, y of the page (the origin is in the top left corner).
Applies to: current tab
browser.click_element(element_id)
Simulates a click on a page element (such as a form button), identified by its ID.
For example, <input type="submit" id="sb"/> would require browser.click_element('sb').
Returns: True if an element with the given ID was found, False otherwise.
Applies to: current tab
browser.set_click_delay(seconds)
After click(), the browser always waits a bit to make sure that all JavaScript animations, etc. are finished before proceeding with script-execution. You may want to set this to 0 if you require faster clicking.
Applies to: complete browser
browser.set_focus_to(element_id)
Similarly to click_element, this function locates a page element (e.g. a form field) by ID and sets the input focus to it. Often used in conjunction with write_text.
Applies to: current tab
browser.write_text(text)
Simulates the user writing text into the browser window. To use this with form fields, you might need to set the
input focus first, either with set_focus_to or by simulating a click on the form element.
Applies to: current tab
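The usual combination of set_focus_to, write_text and click_element might look like the sketch below. The helper name fill_and_submit is hypothetical, and the browser object is passed in as a parameter so the snippet is self-contained.

```python
def fill_and_submit(browser, field_id, text, submit_id):
    """Type `text` into the element `field_id`, then click the element `submit_id`.

    Returns the result of click_element (True if the submit element was found).
    """
    browser.set_focus_to(field_id)   # give the form field the input focus
    browser.write_text(text)         # the text goes to whatever element has focus
    return browser.click_element(submit_id)
```

With the submit button from the click_element example, a call could be fill_and_submit(browser, 'q', 'some text', 'sb'), where 'q' is a made-up field ID.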
browser.press_key(keyId, keyModifiers, keyText)
Simulates a key press in the browser. It is highly recommended to use write_text instead whenever possible
because it's much easier to use.
Applies to: current tab
browser.set_load_timeout(seconds)
Sets how long, in seconds, load_page() waits for the page to load before it gives up.
This defaults to 30 seconds.
Applies to: complete browser
browser.get_load_timeout()
Returns: the timeout specified by set_load_timeout. Defaults to 30 seconds.
Applies to: complete browser
browser.has_error()
Returns: True if the last load_page() operation failed, False otherwise.
Applies to: complete browser
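Since has_error() reports whether the last load_page() call failed, a simple retry loop can be built on top of the two. The following is only a sketch: the helper name load_with_retry and the default of three attempts are arbitrary choices, and the browser object is passed in explicitly.

```python
def load_with_retry(browser, url, attempts=3):
    """Call load_page up to `attempts` times; return True as soon as a load succeeds."""
    for _ in range(attempts):
        browser.load_page(url)        # waits until loading finished or timed out
        if not browser.has_error():   # False means the last load_page() succeeded
            return True
    return False
```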
browser.is_loading()
Returns: True if the page is currently loading contents. This function can be useful in conjunction with
load_page_async. After a call to load_page_async, is_loading() will return True until the page has finished loading.
Applies to: current tab
browser.wait_for_load()
Waits until is_loading() returns False. Returns as soon as the page is completely loaded.
Applies to: current tab
Returns: the HTML code of the current page.
Applies to: current tab
browser.get_page_text()
Returns: the text that's currently readable on the screen, in an unspecified format.
Use 'print browser.get_page_text()' to see what this looks like.
NOTE: This function only returns the text of the frame that currently has focus.
On a page with multiple frames, you might want to click() into the correct frame first to focus it.
Applies to: current tab
browser.page_contains(text)
Returns: True if the given text is contained in the page's currently focused frame. Use click() to move the focus to a different frame if required. Equivalent to the expression: text in browser.get_page_text().
Applies to: current tab
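For pages that fill in content dynamically, polling page_contains can be an alternative to a fixed time.sleep. The sketch below assumes nothing beyond page_contains itself. The helper name wait_for_text, the default timeout and poll interval, and the injectable clock and sleep parameters (there only to make the function easy to test) are all hypothetical.

```python
import time


def wait_for_text(browser, text, timeout=10.0, poll=0.5,
                  clock=time.time, sleep=time.sleep):
    """Poll page_contains(text) until it is True or `timeout` seconds have passed.

    Returns True if the text appeared in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if browser.page_contains(text):
            return True
        if clock() >= deadline:
            return False
        sleep(poll)   # wait a little before checking the page again
```

Keep in mind that page_contains only looks at the currently focused frame, so the same restriction applies to this helper.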
browser.save_image(src, path)
Saves an image from the website to 'path' on your file system.
The parameter 'src' has to be the image path that's written in <img src="..."> on the HTML page.
It can also be a fragment of this path. The first image in the HTML document whose path contains the text given in 'src'
will be saved.
Example: If the image you want appears in the HTML code like this: <img src="http://example.com/img.png?dummy=foo">,
you might want to call browser.save_image('img.png', 'c:/foo.png') or browser.save_image('img.png?dummy=foo', 'c:/foo.png')
or browser.save_image('http://example.com/img.png?dummy=foo', 'c:/foo.png'), and so on.
Applies to: current tab
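The matching rule described for 'src' (the first image whose path contains the given text wins) can be modelled in plain Python. This is only an illustration of the documented behaviour, not AutoWWW's actual implementation. The function name first_matching_src is hypothetical, and case-sensitive matching is an assumption.

```python
def first_matching_src(img_srcs, fragment):
    """Return the first src in document order that contains `fragment`, or None."""
    for src in img_srcs:
        if fragment in src:   # assumption: plain, case-sensitive substring match
            return src
    return None


# Image paths in the order they appear in a hypothetical HTML document.
srcs = [
    "http://example.com/logo.png",
    "http://example.com/img.png?dummy=foo",
]
```

With this model, the fragment 'img.png' selects the second image, while a short fragment such as '.png' would select the logo, because it comes first in the document. A more specific fragment is therefore the safer choice.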
browser.save_as_image(x, y, path)
Saves the web element at position x/y as an image to 'path' on your file system.
Applies to: current tab
Pass False as an argument to force the browser to not load images on web pages. Pass True to enable image loading again.
This can help to decrease traffic usage dramatically, especially when multiple AutoWWW instances are running constantly to do some work.
Applies to: complete browser
browser.mute_sound()
Mutes all HTML5 <audio> elements on the current page
Applies to: The current page. When navigating to a different page, sound will be loaded again.
browser.execute_javascript(jsCode)
Executes the JavaScript code given by jsCode in the context of the current page. This can be used to fill in forms, hide page elements, inject content and much more.
Applies to: current tab
Returns: The result of the last executed JavaScript expression. e.g. if you call browser.execute_javascript('var i = 3; i;'), the function would return 3.
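As one example of filling in forms through execute_javascript, the sketch below sets the value of a form element by ID. The helper name set_field_value is hypothetical, and the browser object is passed in explicitly. json.dumps is used to turn the Python strings into quoted, escaped JavaScript string literals, so that quotes or backslashes in the value cannot break the generated code.

```python
import json


def set_field_value(browser, element_id, value):
    """Set the value of the form element with the given ID via JavaScript.

    Returns whatever execute_javascript returns for the generated code.
    """
    js = "document.getElementById(%s).value = %s;" % (
        json.dumps(element_id),   # e.g. name  ->  "name"
        json.dumps(value),        # quotes and backslashes are escaped
    )
    return browser.execute_javascript(js)
```

The generated code will raise a JavaScript error if no element with that ID exists. How such an error surfaces in the return value is not specified here, so check for the element first if that matters.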
Resizes the browser view to the given width/height in pixels. Only the center part of the browser is resized, not the window itself.
Applies to: complete browser
browser.scroll_to(x, y)
Scrolls to the given position on the page.
Applies to: current tab
browser.set_proxy(hostname, port, username, password)
Makes the currently selected page use the given SOCKS5 proxy server. username and password may be empty.
Applies to: current tab
browser.set_status_text(text)
Writes the given text to the status-bar of the browser.
browser.show_messagebox(text)
Creates a message box with the given text in the browser.
browser.clear_cookies()
Clears all browser cookies of the current tab
Applies to: current tab
browser.set_javascript_enabled(enabled)
Enables/disables JavaScript (pass True or False)
Applies to: all tabs
browser.set_blacklist(list)
Pass a list of strings containing regexes. Requests matching any of them will be blocked.
Applies to: current tab
browser.set_whitelist(list)
Pass a list of strings containing regexes that act as exceptions to the ones provided to set_blacklist.
Applies to: current tab
browser.set_request_overrides(overrides)
Pass a dict that maps regexes to URLs. Requests matching a regex are redirected to the corresponding URL.
Applies to: current tab
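A possible way to combine the three request-filtering calls is sketched below. All patterns and URLs are made-up examples, the helper name apply_request_filters is hypothetical, and it is assumed that set_blacklist takes a list just like set_whitelist. Whether the regexes are matched against the whole request URL or only part of it is not stated here, so the patterns start and end with .* to behave the same either way.

```python
def apply_request_filters(browser):
    """Configure blocking, exceptions and redirects for the current tab."""
    # Requests to block (hypothetical patterns).
    blocked = [r".*\.doubleclick\.net/.*", r".*/ads/.*"]
    # Exceptions to the blocked patterns (hypothetical).
    allowed = [r".*example\.com/ads/.*"]
    # Regex -> replacement URL (hypothetical).
    overrides = {r".*/jquery.*\.js.*": "http://example.com/static/jquery.js"}

    browser.set_blacklist(blocked)           # assumption: takes a list of regex strings
    browser.set_whitelist(allowed)
    browser.set_request_overrides(overrides)
```

Since these settings apply to the current tab, the helper would have to be called again after switching to a different tab.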
browser.clear_caches()
Clears the in-memory cache.
Applies to: all tabs
browser.set_plugins_enabled(enabled)
If set to False, all plugins will be disabled (Flash/Java/...)
Applies to: all tabs
browser.set_allow_popups(allow)
If set to False, popups will be blocked. If set to True, they will be opened in the current tab (no actual popup window).
Applies to: all tabs
browser.set_user_agent(useragent)
Sets the user agent string
Applies to: current tab
browser.clear_console()
Clears the current script-output console