econlearn.tile
-
class econlearn.tile.TilecodeDensity(D, T, L, mem_max=1, offset='optimal', cores=1)
Tile coding approximation of the pdf of X. Fits by averaging. Supports multi-core fit and predict. Options for uniform, random or ‘optimal’ displacement vectors.
Parameters: D : integer
Total number of input dimensions
T : list of integers, length D
Number of tiles per dimension
L : integer
Number of tiling ‘layers’
mem_max : double, (default=1)
Proportion of tiles to store in memory: less than 1 means hashing is used.
min_sample : integer, (default=50)
Minimum number of observations per tile
offset : string, (default=‘optimal’)
Type of displacement vector, one of ‘uniform’, ‘random’ or ‘optimal’
Attributes
tile (Tilecode instance)
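A minimal one-dimensional sketch of what "fits by averaging" can mean for a density estimate: each of the L layers is a histogram shifted by a fraction of a tile width, and the prediction averages the per-layer estimates. The helper names, the uniform shift of l / L tile widths and the use of T + 1 tiles per layer are assumptions made for illustration; this is not the econlearn implementation.

```python
import random

def tile_index(x, a, b, T, shift):
    """Index of the tile containing x in one tiling shifted by `shift` tile widths."""
    z = (x - a) / (b - a) * T + shift   # position measured in tile widths
    return min(max(int(z), 0), T)       # T + 1 tiles cover the shifted range

def fit_density(xs, a, b, T, L):
    """Count observations per tile in each of L uniformly offset tilings."""
    counts = [[0] * (T + 1) for _ in range(L)]
    for x in xs:
        for l in range(L):
            counts[l][tile_index(x, a, b, T, l / L)] += 1
    return counts

def predict_density(x, counts, a, b, T, n):
    """Average the per-layer histogram estimates count / (n * width)."""
    L = len(counts)
    width = (b - a) / T
    total = sum(counts[l][tile_index(x, a, b, T, l / L)] for l in range(L))
    return total / (L * n * width)

random.seed(0)
xs = [random.random() for _ in range(20000)]   # Uniform(0, 1): the true pdf is 1
counts = fit_density(xs, a=0.0, b=1.0, T=10, L=4)
estimate = predict_density(0.5, counts, 0.0, 1.0, 10, len(xs))
print(round(estimate, 2))
```

Tiles at the edge of the range are only partly covered by data in this sketch, so the estimate is biased downwards there; interior points such as 0.5 are unaffected.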
-
class econlearn.tile.TilecodeNearestNeighbour(D, L, mem_max=1, cores=1, offset='optimal')
Fast approximate nearest neighbour search using a tile coding data structure.
Parameters: D : int,
Number of input dimensions
L : int,
Number of tilings or ‘layers’
mem_max : float, optional (default = 1)
Tile array size, values less than 1 turn on hashing
cores : int, optional (default=1)
Number of CPU cores to use (fitting stage is parallelized)
offset : {‘optimal’, ‘random’, ‘uniform’}, optional (default=‘optimal’)
Type of displacement vector used
Notes
This is an approximate method: some points farther than radius from the query may be included, and some points closer than radius may be excluded.
-
fit(X, radius, prop=1)
Fit a tile coding data structure to X.
Parameters: X : array of shape [N, D]
Input data (unscaled)
radius : float
Radius for nearest neighbor queries. Tile widths for each dimension of X are int((b[i] - a[i]) / radius)**-1, where b[i] and a[i] are the max and min values of X[:,i].
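Because the expression above truncates with int(), nearby radius values can produce exactly the same tiling. The short sketch below evaluates only that expression for a single dimension; the function name and the example range are hypothetical.

```python
def truncated_ratio(a, b, radius):
    """Evaluate int((b - a) / radius) for one dimension with range [a, b]."""
    return int((b - a) / radius)

a, b = 0.0, 10.0
ratios = {radius: truncated_ratio(a, b, radius) for radius in (0.9, 0.95, 1.0, 1.1)}
print(ratios)  # {0.9: 11, 0.95: 10, 1.0: 10, 1.1: 9}
```

Here radius=0.95 and radius=1.0 give the same value, so a fitted structure would be identical for both.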
-
predict(X, thresh=1)
Obtain nearest neighbors (points within distance radius).
Parameters: X : array of shape [N, D]
Query points
thresh : int, (default=1)
Only include points if they are active in at least thresh layers (max is L). Higher thresh values will tend to exclude the points furthest from the query point.
Returns: Y : list of arrays (length = N)
Nearest neighbors for each query point
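The role of thresh can be illustrated with a small voting scheme: a stored point is a candidate neighbour in every layer where it shares a tile with the query, and it is returned only if it collects at least thresh votes. The sketch below is a pure-Python stand-in, assuming tiles of width radius and uniform shifts of l / L tile widths; the function names mirror the methods above but are not the library's code.

```python
import math
import random
from collections import defaultdict

def tile_key(point, width, shift):
    """Tile coordinates of `point` in a tiling displaced by `shift` tile widths."""
    return tuple(math.floor(x / width + shift) for x in point)

def fit(points, radius, L):
    """For each of L layers, record which data points fall in each tile."""
    layers = []
    for l in range(L):
        table = defaultdict(list)
        for i, p in enumerate(points):
            table[tile_key(p, radius, l / L)].append(i)
        layers.append(table)
    return layers

def predict(query, layers, radius, thresh=1):
    """Indices of points sharing a tile with `query` in at least `thresh` layers."""
    L = len(layers)
    votes = defaultdict(int)
    for l, table in enumerate(layers):
        for i in table.get(tile_key(query, radius, l / L), []):
            votes[i] += 1
    return sorted(i for i, v in votes.items() if v >= thresh)

random.seed(1)
points = [(0.5, 0.5)] + [(random.random(), random.random()) for _ in range(500)]
layers = fit(points, radius=0.1, L=4)
loose = predict((0.5, 0.5), layers, radius=0.1, thresh=1)
strict = predict((0.5, 0.5), layers, radius=0.1, thresh=4)
print(len(loose), len(strict))
```

Raising thresh can only shrink the result set, which is why higher values tend to drop the points furthest from the query.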
-
-
class econlearn.tile.TilecodeQVIteration(D, T, L, radius, beta, ms=1, mem_max=1, cores=1, ASGD=True, linT=6)
Solve an MDP with 1 policy variable and D state variables by Q-V iteration.
Parameters: D : int,
Number of state variables
T : list of integers, length D
Number of tiles per dimension
L : int,
Number of tilings or ‘layers’
radius : float,
Radius for state space sample grid
beta : float in (0, 1),
Discount rate
ms : int, optional (default = 1)
Minimum samples per tile for the Q function
mem_max : float, optional (default = 1)
Tile array size, values less than 1 turn on hashing
cores : int, optional (default=1)
Number of CPU cores to use
ASGD : boolean, optional (default=True)
Fit Q function by ASGD
offset : {‘optimal’, ‘random’, ‘uniform’}, (default=’optimal’)
Type of displacement vector used
linT : integer, optional (default=6)
Number of linear spline knots per dimension
-
iterate(XA, X1, R, A_low, A_high, ITER=50, plot=False, plotdim=0, output=True, a=0, b=0, pc_samp=1, eta=0.8, maxT=60000, tilesg=False, sg_points=100, sg_prop=0.96, sg_samp=1, sgmem_max=0.4)
Perform Q-V iteration given a set of training data (N state-action and state transition samples) to derive optimal value and policy functions.
Parameters: XA : array of shape [N, D + 1]
State-action samples (i.e., actions in first column, then state variables)
X1 : array of shape [N, D]
State transition samples (i.e., state at t+1)
R : array of shape [N,]
Payoff samples
A_low : array of shape [N,],
Lower feasible bound for action A conditional on X
A_high : array of shape [N,],
Upper feasible bound for action A conditional on X
ITER : int, optional (default = 50)
Number of iterations
plot : boolean, optional (default = False)
Whether to generate plots of the final value and policy function.
plotdim : int in [0, D], optional (default = 0)
Which state dimension to plot (other dimensions are held fixed at their mean values).
a : array, optional, shape=(D)
Percentile to use for minimum tiling domain (if not provided set to 0)
b : array, optional, shape=(D)
Percentile to use for maximum tiling domain (if not provided set to 100)
pc_samp : float, optional, (default=1)
Proportion of sample to use when calculating percentile ranges
output : boolean, optional (default=True)
Whether to print value function change updates each iteration
eta : float, optional (default=0.8)
ASGD / SGD learning rate
maxT : int, optional (default=60000)
ASGD / SGD learning rate parameter
tilesg : boolean, (default=False)
If True then will use tilecoding to build state space sample grid else will use distance method. Tilecoding is preferred for large samples.
sg_points : int, (default=100)
If tilesg=False, then the number of points in the state space sample grid
sg_prop : float, (default=0.96)
If tilesg=True, then the proportion of points to include in the state space sample grid (set less than 1 to exclude outliers)
sg_samp : float, (default=1)
If tilesg=True, then the proportion of the sample points to use for the state space sample grid
sgmem_max : float, (default = 0.4)
If tilesg=True, then the mem_max (hashing) parameter of the sample grid tilecode scheme.
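The overall loop can be sketched on a toy problem: regress the one-step target R + beta * V(X1) on the state-action samples to get Q, maximise Q over actions to get a new V, and repeat. In the sketch below a coarse single-layer grid of cell averages stands in for the tile-coded Q function, and the consumption problem, the function names and the cell count are all hypothetical. It shows the structure of the iteration only, not the library's algorithm or its accuracy.

```python
import random

def cell(z, n):
    """Index of the cell containing z when [0, 1] is split into n equal cells."""
    return min(int(z * n), n - 1)

def qv_iterate(samples, beta, n=10, iters=50):
    """samples: (action, state, reward, next_state) tuples with values in [0, 1]."""
    V = [0.0] * n
    errors = []
    for _ in range(iters):
        total = [[0.0] * n for _ in range(n)]   # indexed [state cell][action cell]
        count = [[0] * n for _ in range(n)]
        for a, x, r, x1 in samples:
            j, i = cell(x, n), cell(a, n)
            total[j][i] += r + beta * V[cell(x1, n)]   # one-step Bellman target
            count[j][i] += 1
        new_V = list(V)
        for j in range(n):
            q_row = [t / c for t, c in zip(total[j], count[j]) if c > 0]
            if q_row:
                new_V[j] = max(q_row)   # V(x) = max over sampled actions of Q(a, x)
        errors.append(sum(abs(u - v) for u, v in zip(new_V, V)) / n)
        V = new_V
    return V, errors

random.seed(0)
samples = []
for _ in range(5000):
    x = random.random()
    a = random.uniform(0.0, x)                  # feasible action: 0 <= a <= x
    samples.append((a, x, a ** 0.5, x - a))     # reward sqrt(a), next state x - a
V, errors = qv_iterate(samples, beta=0.9)
print(errors[0], errors[-1])
```

Because averaging and maximisation do not expand differences, the change in V shrinks by at least a factor beta per pass, so the reported error falls towards zero as the iteration count grows.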
-
maximise(grid, A_low, A_high, output=True)
Maximises the current Q function for a subset of state space points and returns new value and policy functions.
Parameters: grid : array, shape=(N, D)
State space grid
A_low : array, shape=(N,)
action lower bound
A_high : array, shape=(N,)
action upper bound
Returns: ERROR: float
Mean absolute deviation
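As a rough illustration of this step, the sketch below searches each state's feasible interval [A_low, A_high] for the action with the highest Q value and reports the mean absolute change in the value function. The grid search, the quadratic Q function, the v_old argument and the tuple return are assumptions made for the example; the library may use a different optimiser and return convention.

```python
def maximise(q, grid, a_low, a_high, v_old, n_actions=101):
    """Grid-search each state's feasible action interval for the best Q value."""
    values, policy = [], []
    for x, lo, hi in zip(grid, a_low, a_high):
        candidates = [lo + (hi - lo) * k / (n_actions - 1) for k in range(n_actions)]
        best = max(candidates, key=lambda act: q(act, x))
        policy.append(best)
        values.append(q(best, x))
    # Mean absolute deviation between the new and the previous value function.
    error = sum(abs(v - w) for v, w in zip(values, v_old)) / len(values)
    return values, policy, error

def q(a, x):
    """Hypothetical Q function, concave in the action with its peak at a = x / 2."""
    return -(a - x / 2) ** 2 + x

grid = [0.0, 1.0, 2.0]
values, policy, error = maximise(q, grid, [0.0] * 3, [2.0] * 3, v_old=[0.0] * 3)
print(policy, values, error)
```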
-
resetQ(D, T, L, mem_max=1, ms=1)
Reset the Q function.
Parameters: D : int,
Number of state variables
T : list of integers, length D
Number of tiles per dimension
L : int,
Number of tilings or ‘layers’
mem_max : float, optional (default = 1)
Tile array size, values less than 1 turn on hashing
ms : int, optional (default = 1)
Minimum samples per tile for the Q function
-
-
class econlearn.tile.TilecodeRegressor(D, T, L, mem_max=1, min_sample=1, offset='optimal', lin_spline=False, linT=7, cores=4)
Tile coding for function approximation (Supervised Learning). Fits by averaging and/or Stochastic Gradient Descent. Supports multi-core fit and predict. Options for uniform, random or ‘optimal’ displacement vectors. Provides an option for linear spline extrapolation / filling.
Parameters: D : integer
Total number of input dimensions
T : list of integers, length D
Number of tiles per dimension
L : integer
Number of tiling ‘layers’
mem_max : double, (default=1)
Proportion of tiles to store in memory: less than 1 means hashing is used.
min_sample : integer, (default=1)
Minimum number of observations per tile
offset : string, (default=‘optimal’)
Type of displacement vector, one of ‘uniform’, ‘random’ or ‘optimal’
lin_spline : boolean, (default=False)
Use sparse linear spline model to extrapolate / fill empty tiles
linT : integer, (default=7)
Number of linear spline knots per dimension
Attributes
tile (Cython Tilecode instance)
-
check_memory()
Provides information on the current memory usage of the tilecoding scheme. If memory usage is an issue, call this function after fitting and then consider rebuilding the scheme with a lower mem_max parameter.
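To make the memory trade-off concrete, the sketch below shows one way a mem_max below 1 could shrink the weight table: tile coordinates are hashed into a table holding only a fraction of the full tile count, at the cost of occasional collisions. The full-size formula L * prod(T_i + 1) and the use of Python's built-in hash are assumptions for illustration, not a description of the library's internals.

```python
def table_size(T, L, mem_max):
    """Return (slots kept in memory, full tile count) for a given mem_max."""
    full = L
    for t in T:
        full *= t + 1                 # assumed: T_i + 1 tiles per dimension per layer
    return max(1, int(full * mem_max)), full

def slot(coords, layer, size):
    """Map a (layer, tile coordinates) pair to a table slot; collisions can occur."""
    return hash((layer,) + tuple(coords)) % size

size, full = table_size(T=[10, 10], L=4, mem_max=0.25)
used = {slot((i, j), l, size) for l in range(4) for i in range(11) for j in range(11)}
print(size, full, len(used))
```

With mem_max=0.25 the 484 tiles share 121 slots, so several tiles necessarily map to the same weight.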
-
fit(X, Y, method='A', score=False, copy=True, a=0, b=0, pc_samp=1, eta=0.01, n_iters=1, scale=0)
Estimate tilecode weights. Supports ‘Averaging’, Stochastic Gradient Descent (SGD) and Averaged SGD.
Parameters: X : array, shape=(N, D)
Input data (unscaled)
Y : array, shape=(N)
Output data (unscaled)
method : string (default=’A’)
Estimation method, one of ‘A’ (for Averaging), ‘SGD’ or ‘ASGD’.
score : boolean, (default=False)
Calculate R-squared
copy : boolean (default=True)
Store X and Y
a : array, optional shape=(D)
Percentile to use for minimum tiling range (if not provided set to 0)
b : array, optional, shape=(D)
Percentile to use for maximum tiling range (if not provided set to 100)
pc_samp : float, optional, (default=1)
Proportion of sample to use when calculating percentile ranges
eta : float (default=.01)
SGD Learning rate
n_iters : int (default=1)
Number of passes over the data set in SGD
scale : float (default=0)
Learning rate scaling factor in SGD
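The difference between the ‘A’ and ‘SGD’ methods can be sketched in one dimension. Averaging sets each tile weight to the mean of the targets that fall in that tile; SGD then nudges the active weights towards each observation in turn. The code below is a pure-Python illustration under assumed conventions (inputs in [0, 1), uniform shifts, prediction equal to the mean of the active weights, a constant step size); it does not reproduce the library's update rule, and ASGD is not shown.

```python
import math
import random

def active_tiles(x, T, L):
    """One active tile per layer for x in [0, 1); layer l is shifted by l / L widths."""
    return [min(int(x * T + l / L), T) for l in range(L)]

def predict(x, w, T):
    """Prediction is the mean of the active tile weights across layers."""
    L = len(w)
    return sum(w[l][t] for l, t in enumerate(active_tiles(x, T, L))) / L

def fit_average(X, Y, T, L):
    """Set each tile weight to the mean of the targets observed in that tile."""
    total = [[0.0] * (T + 1) for _ in range(L)]
    count = [[0] * (T + 1) for _ in range(L)]
    for x, y in zip(X, Y):
        for l, t in enumerate(active_tiles(x, T, L)):
            total[l][t] += y
            count[l][t] += 1
    return [[s / c if c else 0.0 for s, c in zip(ts, cs)] for ts, cs in zip(total, count)]

def fit_sgd(X, Y, w, T, eta=0.1, n_iters=1):
    """Refine weights in place: move each active weight by eta times the error."""
    L = len(w)
    for _ in range(n_iters):
        for x, y in zip(X, Y):
            err = y - predict(x, w, T)
            for l, t in enumerate(active_tiles(x, T, L)):
                w[l][t] += eta * err
    return w

def mse(X, Y, w, T):
    return sum((y - predict(x, w, T)) ** 2 for x, y in zip(X, Y)) / len(X)

random.seed(0)
X = [random.random() for _ in range(5000)]
Y = [math.sin(2 * math.pi * x) for x in X]
w_avg = fit_average(X, Y, T=10, L=4)
mse_avg = mse(X, Y, w_avg, 10)
w_sgd = fit_sgd(X, Y, [row[:] for row in w_avg], T=10, eta=0.1, n_iters=1)
mse_sgd = mse(X, Y, w_sgd, 10)
print(mse_avg, mse_sgd)
```

Starting SGD from the averaged weights, as done here, is one plausible reading of "averaging and/or SGD"; starting from zero weights would also work but needs more passes.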
-
-
class econlearn.tile.TilecodeSamplegrid(D, L, mem_max=1, cores=1, offset='optimal')
Construct a sample grid (a sample of approximately equidistant points) from a large data set, using a tilecoding data structure.
Parameters: D : int,
Number of input dimensions
L : int,
Number of tilings or ‘layers’
mem_max : float, optional (default = 1)
Tile array size, values less than 1 turn on hashing
cores : int, optional (default=1)
Number of CPU cores to use (fitting stage is parallelized)
offset : {‘optimal’, ‘random’, ‘uniform’}, optional (default=‘optimal’)
Type of displacement vector used
Notes
This is an approximate method: it is possible that the resulting sample will contain some points less than radius distance apart. The accuracy improves with the number of layers L. Currently the tile widths are defined as int((b - a) / radius)**-1, so small changes in radius may have no effect.
-
fit(X, radius, prop=1)
Fit a density function to X and return a sample grid with a maximum of M points.
Parameters: X : array of shape [N, D]
Input data (unscaled)
radius : float
minimum distance between points. This determines tile widths.
prop : float in (0, 1], optional (default=1.0)
Proportion of sample points to return (lowest density points are excluded)
Returns: GRID : array of shape [M, D]
The sample grid with M < N points
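One way to picture the construction is greedy thinning: scan the data once and keep a point only if the tile it falls in is still empty in every layer, then drop the lowest-density survivors according to prop. The sketch below assumes tiles of width radius, uniform shifts of l / L tile widths and layer-0 tile counts as the density measure; these choices and the function names are illustrative, not the library's method.

```python
import math
import random

def tile_key(point, radius, shift):
    """Tile coordinates of `point` in a tiling displaced by `shift` tile widths."""
    return tuple(math.floor(x / radius + shift) for x in point)

def sample_grid(points, radius, L, prop=1.0):
    """Greedy thinning: keep a point only if its tile is empty in every layer."""
    occupied = [set() for _ in range(L)]
    density = {}                      # layer-0 tile counts, a crude density proxy
    grid = []
    for p in points:
        k0 = tile_key(p, radius, 0.0)
        density[k0] = density.get(k0, 0) + 1
        keys = [tile_key(p, radius, l / L) for l in range(L)]
        if all(k not in occ for k, occ in zip(keys, occupied)):
            for k, occ in zip(keys, occupied):
                occ.add(k)
            grid.append(p)
    # Keep the share `prop` of grid points that sit in the densest tiles.
    grid.sort(key=lambda p: density[tile_key(p, radius, 0.0)], reverse=True)
    return grid[:math.ceil(prop * len(grid))]

random.seed(2)
points = [(random.random(), random.random()) for _ in range(2000)]
grid = sample_grid(points, radius=0.1, L=4)
trimmed = sample_grid(points, radius=0.1, L=4, prop=0.5)
print(len(points), len(grid), len(trimmed))
```

No two retained points share a tile in any layer, which keeps them roughly radius apart without computing pairwise distances; points that are close but straddle a tile boundary in every layer can still both survive, matching the approximation noted above.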
-