Developer interface

These are the interfaces that PyPAC exposes to developers.

Main interface

These are the most commonly used components of PyPAC.

class pypac.PACSession(pac=None, proxy_auth=None, pac_enabled=True, response_proxy_fail_filter=None, exception_proxy_fail_filter=None, socks_scheme='socks5', **kwargs)[source]

A PAC-aware Requests Session that discovers and complies with a PAC file, without any configuration necessary. PAC file discovery is accomplished via the Windows Registry (if applicable), and the Web Proxy Auto-Discovery (WPAD) protocol. Alternatively, a PAC file may be provided in the constructor.

Parameters:
  • pac (PACFile) – The PAC file to consult for proxy configuration info. If not provided, then upon the first request, get_pac() is called with default arguments in order to find a PAC file.
  • proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
  • pac_enabled (bool) – Set to False to disable all PAC functionality, including PAC auto-discovery.
  • response_proxy_fail_filter – Callable that takes a requests.Response and returns a boolean for whether the response means the proxy used for the request should no longer be used. By default, the response is not inspected.
  • exception_proxy_fail_filter – Callable that takes an exception and returns a boolean for whether the exception means the proxy used for the request should no longer be used. By default, requests.exceptions.ConnectTimeout and requests.exceptions.ProxyError are matched.
  • socks_scheme (str) – Scheme to use when PAC file returns a SOCKS proxy. socks5 by default.
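
As a rough illustration of the response_proxy_fail_filter parameter, the following minimal sketch defines a callable that takes a response and returns a boolean. The function name and the set of status codes are assumptions made for this example and are not part of PyPAC. A stand-in object is used in place of a real requests.Response so the filter can be exercised without a network call.

```python
from types import SimpleNamespace

# Status codes treated as "the proxy itself is failing". This set is an
# assumption for illustration; adjust it to match your own proxy's behaviour.
PROXY_FAILURE_STATUSES = frozenset({502, 503, 504})


def response_indicates_proxy_failure(response):
    """Return True if the response suggests the proxy should no longer be used."""
    return response.status_code in PROXY_FAILURE_STATUSES


# Stand-in objects that only carry the attribute the filter inspects.
bad_gateway = SimpleNamespace(status_code=502)
ok = SimpleNamespace(status_code=200)
print(response_indicates_proxy_failure(bad_gateway))
print(response_indicates_proxy_failure(ok))
```

Such a callable would then be passed to the constructor as PACSession(response_proxy_fail_filter=response_indicates_proxy_failure).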
do_proxy_failover(proxy_url, for_url)[source]
Parameters:
  • proxy_url (str) – Proxy to ban.
  • for_url (str) – The URL being requested.
Returns:

The next proxy config to try, or ‘DIRECT’.

Raises:

ProxyConfigExhaustedError – If the PAC file provided no usable proxy configuration.

get_pac(**kwargs)[source]

Search for, download, and parse PAC file if it hasn’t already been done. This method is called upon the first use of request(), but can also be called manually beforehand if desired. Subsequent calls to this method will only return the obtained PAC file, if any.

Returns: The obtained PAC file, if any.
Return type: PACFile|None
Raises: MalformedPacError – If something that claims to be a PAC file was downloaded but could not be parsed.
pac_enabled = None

Set to False to disable all PAC functionality, including PAC auto-discovery.

proxy_auth

Proxy authentication object.

request(method, url, proxies=None, **kwargs)[source]
Raises:
pypac.get_pac(url=None, js=None, from_os_settings=True, from_dns=True, timeout=2, allowed_content_types=None, session=None, **kwargs)[source]

Convenience function for finding and getting a parsed PAC file (if any) that’s ready to use.

Parameters:
  • url (str) – Download PAC from a URL. If provided, from_os_settings and from_dns are ignored.
  • js (str) – Parse the given string as a PAC file. If provided, from_os_settings and from_dns are ignored.
  • from_os_settings (bool) – Look for a PAC URL or filesystem path from the OS settings, and use it if present. Has no effect on platforms other than Windows and macOS/OSX.
  • from_dns (bool) – Look for a PAC file using the WPAD protocol.
  • timeout – Time to wait for host resolution and response for each URL.
  • allowed_content_types – If the response has a Content-Type header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig and application/x-javascript-config.
  • session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
Returns:

The first valid parsed PAC file according to the criteria, or None if nothing was found.

Return type:

PACFile|None

Raises:

MalformedPacError – If something that claims to be a PAC file was obtained but could not be parsed.

pypac.collect_pac_urls(from_os_settings=True, from_dns=True, **kwargs)[source]

Get all the URLs that potentially yield a PAC file.

Parameters:
  • from_os_settings (bool) – Look for a PAC URL from the OS settings. If a value is found and is a URL, it comes first in the returned list. Has no effect on platforms other than Windows and macOS/OSX.
  • from_dns (bool) – Assemble a list of PAC URL candidates using the WPAD protocol.
Returns:

A list of URLs that should be tried in order.

Return type:

list[str]

pypac.download_pac(candidate_urls, timeout=1, allowed_content_types=None, session=None)[source]

Try to download a PAC file from one of the given candidate URLs.

Parameters:
  • candidate_urls (list[str]) – URLs that are expected to return a PAC file. Requests are made in order, one by one.
  • timeout – Time to wait for host resolution and response for each URL. When a timeout or DNS failure occurs, the next candidate URL is tried.
  • allowed_content_types – If the response has a Content-Type header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig and application/x-javascript-config.
  • session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
Returns:

Contents of the PAC file, or None if no URL was successful.

Return type:

str|None

pypac.pac_context_for_url(url, proxy_auth=None, pac=None)[source]

This context manager provides a simple way to add rudimentary PAC functionality to code that cannot be modified to use PACSession, but obeys the HTTP_PROXY and HTTPS_PROXY environment variables.

Upon entering this context, PAC discovery occurs with default parameters. If a PAC is found, then it’s asked for the proxy to use for the given URL. The proxy environment variables are then set accordingly.

Note that this provides a very simplified PAC experience that’s insufficient for some scenarios.

Parameters:
  • url – Consult the PAC for the proxy to use for this URL.
  • proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
  • pac (PACFile) – The PAC to use to resolve the proxy. If not provided, get_pac() is called with default arguments in order to find a PAC file.
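
The general pattern of setting the proxy environment variables for the duration of a block can be sketched with the standard library alone. This is not PyPAC's implementation: the name proxy_env is hypothetical, the proxy URL is passed in directly instead of being resolved from a PAC file, and restoring the previous values on exit is a choice made by this sketch.

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_env(proxy_url):
    """Temporarily point HTTP_PROXY and HTTPS_PROXY at proxy_url."""
    names = ("HTTP_PROXY", "HTTPS_PROXY")
    # Remember the current values (None means the variable was unset).
    saved = {name: os.environ.get(name) for name in names}
    try:
        for name in names:
            os.environ[name] = proxy_url
        yield
    finally:
        # Put back whatever was there before, removing variables that were unset.
        for name, old_value in saved.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value


with proxy_env("http://proxy.example.local:8080"):
    # Code that honours the proxy environment variables would run here.
    print(os.environ["HTTP_PROXY"])
```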

PAC parsing and execution

Functions and classes for parsing and executing PAC files.

class pypac.parser.PACFile(pac_js, **kwargs)[source]

Represents a PAC file.

JavaScript parsing and execution is handled by the dukpy library.

Load a PAC file from a given string of JavaScript. Errors during parsing and validation may raise a specialized exception.

Parameters: pac_js (str) – JavaScript that defines the FindProxyForURL() function.
Raises: MalformedPacError – If the JavaScript could not be parsed, does not define FindProxyForURL(), or is otherwise invalid.
find_proxy_for_url(url, host)[source]

Call FindProxyForURL() in the PAC file with the given arguments.

Parameters:
  • url (str) – The full URL.
  • host (str) – The URL’s host.
Returns:

Result of evaluating the FindProxyForURL() JavaScript function in the PAC file.

Return type:

str

pypac.parser.parse_pac_value(value, socks_scheme=None)[source]

Parse the return value of FindProxyForURL() into a list. List elements will either be the string “DIRECT” or a proxy URL.

For example, the result of parsing PROXY example.local:8080; DIRECT is a list containing strings http://example.local:8080 and DIRECT.

Parameters:
  • value (str) – Any value returned by FindProxyForURL().
  • socks_scheme (str) – Scheme to assume for SOCKS proxies. socks5 by default.
Returns:

Parsed output, with invalid elements ignored. Warnings are logged for invalid elements.

Return type:

list[str]

pypac.parser.proxy_url(value, socks_scheme=None)[source]

Parse a single proxy config value from FindProxyForURL() into a more usable element.

The recognized keywords are DIRECT, PROXY, SOCKS, SOCKS4, and SOCKS5. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Proxy_servers_and_tunneling/Proxy_Auto-Configuration_PAC_file#return_value_format

Parameters:
  • value (str) – Value to parse, e.g.: DIRECT, PROXY example.local:8080, or SOCKS example.local:8080.
  • socks_scheme (str) – Scheme to assume for SOCKS proxies. socks5 by default.
Returns:

Parsed value, e.g.: DIRECT, http://example.local:8080, or socks5://example.local:8080.

Return type:

str

Raises:

ValueError – If input value is invalid.
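
To make the conversion described for proxy_url() and parse_pac_value() concrete, here is a simplified standalone sketch. It is not PyPAC's code. The function names carry a _sketch suffix to mark them as hypothetical, and mapping SOCKS4 and SOCKS5 to the socks4 and socks5 schemes is an assumption. The real parse_pac_value() also logs a warning for each invalid element, which this sketch omits.

```python
def proxy_url_sketch(value, socks_scheme="socks5"):
    """Convert one element such as 'PROXY host:port' into a URL or 'DIRECT'."""
    parts = value.split()
    if not parts:
        raise ValueError("empty proxy config element")
    keyword = parts[0].upper()
    if keyword == "DIRECT" and len(parts) == 1:
        return "DIRECT"
    if len(parts) != 2:
        raise ValueError("unrecognized proxy config element: %r" % value)
    # Assumed keyword-to-scheme mapping; plain SOCKS uses the configurable scheme.
    schemes = {
        "PROXY": "http",
        "SOCKS": socks_scheme,
        "SOCKS4": "socks4",
        "SOCKS5": "socks5",
    }
    if keyword not in schemes:
        raise ValueError("unrecognized proxy keyword: %r" % parts[0])
    return "%s://%s" % (schemes[keyword], parts[1])


def parse_pac_value_sketch(value, socks_scheme="socks5"):
    """Split a FindProxyForURL() result on ';' and skip invalid elements."""
    results = []
    for element in value.split(";"):
        if not element.strip():
            continue
        try:
            results.append(proxy_url_sketch(element, socks_scheme))
        except ValueError:
            continue  # invalid elements are ignored
    return results


print(parse_pac_value_sketch("PROXY example.local:8080; DIRECT"))
```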

class pypac.parser.MalformedPacError(msg=None, original_exc=None)[source]

PAC JavaScript functions

Python implementations of JavaScript functions needed to execute a PAC file.

These are injected into the JavaScript execution context. They aren’t meant to be called directly from Python, so the function signatures may look unusual.

Most docstrings below are adapted from http://findproxyforurl.com/netscape-documentation/.

pypac.parser_functions.alert(_)[source]

No-op. PyPAC ignores JavaScript alerts.

pypac.parser_functions.dateRange(*args)[source]

Accepted forms:

  • dateRange(day)
  • dateRange(day1, day2)
  • dateRange(month)
  • dateRange(month1, month2)
  • dateRange(year)
  • dateRange(year1, year2)
  • dateRange(day1, month1, day2, month2)
  • dateRange(month1, year1, month2, year2)
  • dateRange(day1, month1, year1, day2, month2, year2)
  • dateRange(day1, month1, year1, day2, month2, year2, gmt)
day
is the day of month between 1 and 31 (as an integer).
month
is one of the month strings:
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
year
is the full year number, for example 1995 (but not 95). Integer.
gmt
is the string “GMT”, which makes time comparisons occur in the GMT timezone. If left unspecified, times are taken to be in the local timezone.

Although the forms listed above don’t show it, the “GMT” parameter can be specified in any of the 9 different call profiles, always as the last parameter.

If only a single value is specified (from each category: day, month, year), the function returns a true value only on days that match that specification. If both values are specified, the result is true between those times, including bounds.

Return type: bool
pypac.parser_functions.dnsDomainIs(host, domain)[source]
Parameters:
  • host (str) – is the hostname from the URL.
  • domain (str) – is the domain name to test the hostname against.
Returns:

True iff the domain of the hostname matches.

Return type:

bool

pypac.parser_functions.dnsDomainLevels(host)[source]
Parameters: host (str) – is the hostname from the URL.
Returns: the number (integer) of DNS domain levels (number of dots) in the hostname.
Return type: int
pypac.parser_functions.dnsResolve(host)[source]

Resolves the given DNS hostname into an IP address, and returns it in the dot-separated format as a string. Returns an empty string if there is an error.

Parameters: host (str) – hostname to resolve
Returns: Resolved IP address, or empty string if resolution failed.
Return type: str
pypac.parser_functions.isInNet(host, pattern, mask)[source]

Pattern and mask specification is done the same way as for SOCKS configuration.

Parameters:
  • host (str) – a DNS hostname, or IP address. If a hostname is passed, it will be resolved into an IP address by this function.
  • pattern (str) – an IP address pattern in the dot-separated format
  • mask (str) – mask for the IP address pattern informing which parts of the IP address should be matched against. 0 means ignore, 255 means match.
Returns:

True iff the IP address of the host matches the specified IP address pattern.

Return type:

bool
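
The pattern-and-mask comparison can be sketched with the standard ipaddress module. This is an illustration and not PyPAC's implementation: the function name is hypothetical, and it assumes the host has already been resolved to an IPv4 address, so no DNS lookup is performed.

```python
import ipaddress


def is_in_net_sketch(ip, pattern, mask):
    """Return True if ip matches pattern on every bit selected by mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    pattern_int = int(ipaddress.IPv4Address(pattern))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Bits where the mask is 0 are ignored; bits where it is 1 must be equal.
    return (ip_int & mask_int) == (pattern_int & mask_int)


print(is_in_net_sketch("192.168.1.5", "192.168.0.0", "255.255.0.0"))
print(is_in_net_sketch("10.0.0.1", "192.168.0.0", "255.255.0.0"))
```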

pypac.parser_functions.isPlainHostName(host)[source]
Parameters: host (str) – the hostname from the URL (excluding port number).
Returns: True iff there is no domain name in the hostname (no dots).
Return type: bool
pypac.parser_functions.isResolvable(host)[source]

Tries to resolve the hostname.

Parameters: host (str) – is the hostname from the URL.
Returns: True if resolution succeeds.
Return type: bool
pypac.parser_functions.localHostOrDomainIs(host, hostdom)[source]
Parameters:
  • host (str) – the hostname from the URL.
  • hostdom (str) – fully qualified hostname to match against.
Returns:

true if the hostname matches exactly the specified hostname, or if there is no domain name part in the hostname, but the unqualified hostname matches.

Return type:

bool
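
The two matching rules described for localHostOrDomainIs() can be written out as a small standalone sketch. The function name is hypothetical and the code is an illustration of the documented behaviour, not PyPAC's implementation.

```python
def local_host_or_domain_is_sketch(host, hostdom):
    """Match host against hostdom exactly, or by unqualified name if host has no dots."""
    if "." in host:
        # Host carries a domain part, so only an exact match counts.
        return host == hostdom
    # Unqualified host: compare against the first label of hostdom.
    return hostdom.split(".", 1)[0] == host


print(local_host_or_domain_is_sketch("www", "www.example.com"))
print(local_host_or_domain_is_sketch("www.other.com", "www.example.com"))
```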

pypac.parser_functions.myIpAddress()[source]
Returns: the IP address of the host that the Navigator is running on, as a string in the dot-separated integer format.
Return type: str
pypac.parser_functions.shExpMatch(host, pattern)[source]

Case-insensitive host comparison using a shell expression pattern.

Parameters:
  • host (str) –
  • pattern (str) – Shell expression pattern to match against.
Return type:

bool
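
A case-insensitive shell-pattern comparison can be approximated with the standard fnmatch module. This is a sketch under that assumption, with a hypothetical function name. PyPAC's own implementation may treat some patterns differently.

```python
import fnmatch


def sh_exp_match_sketch(host, pattern):
    """Compare host with a shell-style pattern, ignoring case."""
    # Lower-case both sides and use the case-sensitive matcher so the
    # result does not depend on the operating system's filename rules.
    return fnmatch.fnmatchcase(host.lower(), pattern.lower())


print(sh_exp_match_sketch("WWW.Example.com", "*.example.com"))
print(sh_exp_match_sketch("example.org", "*.example.com"))
```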

pypac.parser_functions.timeRange(*args)[source]

Accepted forms:

  • timeRange(hour)
  • timeRange(hour1, hour2)
  • timeRange(hour1, min1, hour2, min2)
  • timeRange(hour1, min1, sec1, hour2, min2, sec2)
  • timeRange(hour1, min1, sec1, hour2, min2, sec2, gmt)
hour
is the hour from 0 to 23. (0 is midnight, 23 is 11 pm.)
min
minutes from 0 to 59.
sec
seconds from 0 to 59.
gmt
either the string “GMT” for GMT timezone, or not specified, for local timezone. Again, even though the above list doesn’t show it, this parameter may be present in each of the different parameter profiles, always as the last parameter.
Returns: True during (or between) the specified time(s).
Return type: bool
pypac.parser_functions.weekdayRange(start_day, end_day=None, gmt=None)[source]

Accepted forms:

  • weekdayRange(wd1)
  • weekdayRange(wd1, gmt)
  • weekdayRange(wd1, wd2)
  • weekdayRange(wd1, wd2, gmt)

If only one parameter is present, the function yields a true value on the weekday that the parameter represents. If the string “GMT” is specified as a second parameter, times are taken to be in GMT, otherwise in local timezone.

If both wd1 and wd2 are defined, the condition is true if the current weekday is in between those two weekdays. Bounds are inclusive. If the gmt parameter is specified, times are taken to be in GMT, otherwise the local timezone is used.

Weekday arguments are one of MON TUE WED THU FRI SAT SUN.

Parameters:
  • start_day (str) – Weekday string.
  • end_day (str) – Weekday string.
  • gmt (str) – Either the string GMT, or left out.
Return type:

bool
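
The accepted forms above can be illustrated with a simplified standalone sketch. It is not PyPAC's code: the function name and the extra now parameter (used to make the result reproducible) are hypothetical, and treating a range such as FRI to MON as wrapping around the weekend is an assumption of this sketch.

```python
from datetime import datetime, timezone

WEEKDAYS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


def weekday_range_sketch(start_day, end_day=None, gmt=None, now=None):
    """Return True if the current weekday lies within the inclusive range."""
    # Support the two-argument form weekdayRange(wd1, "GMT").
    if end_day == "GMT":
        end_day, gmt = None, "GMT"
    if now is None:
        now = datetime.now(timezone.utc) if gmt == "GMT" else datetime.now()
    today = now.weekday()  # Monday is 0, matching the order of WEEKDAYS
    start = WEEKDAYS.index(start_day)
    end = start if end_day is None else WEEKDAYS.index(end_day)
    if start <= end:
        return start <= today <= end
    # Assumed wrap-around, e.g. FRI..MON covers FRI, SAT, SUN and MON.
    return today >= start or today <= end


wednesday = datetime(2024, 1, 3)  # 3 January 2024 was a Wednesday
print(weekday_range_sketch("MON", "FRI", now=wednesday))
print(weekday_range_sketch("SAT", "SUN", now=wednesday))
```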

Proxy resolution

Tools for working with a given PAC file and its return values.

class pypac.resolver.ProxyResolver(pac, proxy_auth=None, socks_scheme='socks5')[source]

Handles the lookup of the proxy to use for any given URL, including proxy failover logic.

Parameters:
  • pac (pypac.parser.PACFile) – Parsed PAC file.
  • proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication. If provided, then all proxy URLs returned will include these credentials.
  • socks_scheme (str) – Scheme to assume for SOCKS proxies. socks5 by default.
pypac.resolver.add_proxy_auth(possible_proxy_url, proxy_auth)[source]

Add a username and password to a proxy URL, if the input value is a proxy URL.

Parameters:
Returns:

Proxy URL with auth info added, or DIRECT.

Return type:

str

pypac.resolver.proxy_parameter_for_requests(proxy_url_or_direct)[source]
Parameters: proxy_url_or_direct (str) – Proxy URL, or DIRECT. Cannot be empty.
Returns: Value for use with the proxies parameter in Requests.
Return type: dict
class pypac.resolver.ProxyConfigExhaustedError(for_url)[source]

WPAD functions

Tools for the Web Proxy Auto-Discovery Protocol.

pypac.wpad.proxy_urls_from_dns(local_hostname=None)[source]

Generate URLs from which to look for a PAC file, based on a hostname. Fully-qualified hostnames are checked against the Public Suffix List to ensure that generated URLs don’t go outside the scope of the organization. If the fully-qualified hostname doesn’t have a recognized TLD, such as in the case of intranets with ‘.local’ or ‘.internal’, the TLD is assumed to be the part following the rightmost dot.

Parameters: local_hostname (str) – Hostname to use for generating the WPAD URLs. If not provided, the local hostname is used.
Returns: PAC URLs to try in order, according to the WPAD protocol. If the hostname isn’t qualified or is otherwise invalid, an empty list is returned.
Return type: list[str]
pypac.wpad.wpad_search_urls(subdomain_or_host, fld)[source]

Generate URLs from which to look for a PAC file, based on the subdomain and first-level domain (FLD) parts of a fully-qualified host name.

Parameters:
  • subdomain_or_host (str) – Subdomain portion of the fully-qualified host name. For foo.bar.example.com, this is foo.bar.
  • fld (str) – First-level domain (FLD) portion of the fully-qualified host name. For foo.bar.example.com, this is example.com.
Returns:

PAC URLs to try in order, according to the WPAD protocol.

Return type:

list[str]
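
The idea of walking up the domain hierarchy can be sketched as follows. This is an illustration of the WPAD convention (a host named wpad serving /wpad.dat) and not a copy of PyPAC's implementation. The function name is hypothetical, and dropping the leftmost label on the assumption that it is the machine's own host name is a choice of this sketch. The exact list PyPAC returns may differ.

```python
def wpad_urls_sketch(subdomain_or_host, fld):
    """Build candidate WPAD URLs, most specific domain first."""
    labels = subdomain_or_host.split(".") if subdomain_or_host else []
    urls = []
    # Start at index 1 to drop the leftmost label (assumed to be the host
    # name), then keep stripping labels until only the FLD remains.
    for i in range(1, len(labels) + 1):
        domain = ".".join(labels[i:] + [fld])
        urls.append("http://wpad.%s/wpad.dat" % domain)
    return urls


for url in wpad_urls_sketch("foo.bar", "example.com"):
    print(url)
```

Note that the search stops at the FLD, which is the property the Public Suffix List check is meant to guarantee: no candidate URL points outside the organization's own domain.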

OS stuff

Tools for getting the configured PAC file URL out of the OS settings.

exception pypac.os_settings.NotDarwinError[source]
exception pypac.os_settings.NotWindowsError[source]
pypac.os_settings.ON_DARWIN = False

True if running on macOS/OSX.

pypac.os_settings.ON_WINDOWS = False

True if running on Windows.

pypac.os_settings.autoconfig_url_from_preferences()[source]

Get the PAC AutoConfigURL value from the macOS System Preferences. This setting is visible as the “URL” field in System Preferences > Network > Advanced… > Proxies > Automatic Proxy Configuration.

Returns: The value from the system preferences, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
Return type: str|None
Raises: NotDarwinError – If called on a non-macOS/OSX platform.
pypac.os_settings.autoconfig_url_from_registry()[source]

Get the PAC AutoConfigURL value from the Windows Registry. This setting is visible as the “Use automatic configuration script” field in Internet Options > Connections > LAN Settings.

Returns: The value from the registry, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
Return type: str|None
Raises: NotWindowsError – If called on a non-Windows platform.
pypac.os_settings.file_url_to_local_path(file_url)[source]

Parse an AutoConfigURL value with file:// scheme into a usable local filesystem path.

Parameters: file_url – Must start with file://.
Returns: A local filesystem path. It might not exist.
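
A conversion of this kind can be sketched with the standard library. This is not PyPAC's implementation: the function name is hypothetical, raising ValueError for other schemes is a choice of this sketch, and any host component in the URL (as used for network shares) is ignored. The exact path returned depends on the platform's path conventions.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path_sketch(file_url):
    """Turn a file:// URL into a local filesystem path string."""
    parsed = urlparse(file_url)
    if parsed.scheme != "file":
        raise ValueError("expected a file:// URL, got %r" % file_url)
    # url2pathname decodes percent-escapes and applies platform path rules.
    return url2pathname(parsed.path)


print(file_url_to_path_sketch("file:///tmp/my%20proxy.pac"))
```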