att_viz package
Submodules
att_viz.attention_aggregation_method module
- class att_viz.attention_aggregation_method.AttentionAggregationMethod(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
Represents the possible attention aggregation methods. The supported aggregation methods are:
NONE: no dimension collapses
HEADWISE_AVERAGING: head dimension is collapsed
- HEADWISE_AVERAGING = 2
Represents the headwise averaging aggregation method - the head dimension of the attention matrix collapses, while the layer dimension is kept.
- NONE = 1
Represents the empty aggregation method - all layer and head dimensions of the attention matrix are kept.
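For illustration, selecting one of the two documented members looks like this:

```python
from att_viz.attention_aggregation_method import AttentionAggregationMethod

# Keep every layer and head dimension of the attention matrix...
aggr = AttentionAggregationMethod.NONE

# ...or collapse the head dimension by averaging, keeping the layer dimension.
aggr = AttentionAggregationMethod.HEADWISE_AVERAGING
```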
att_viz.attention_matrix module
- class att_viz.attention_matrix.AttentionMatrix(attention_matrix)
Bases:
object
Represents a self-attention matrix, recording the number of layers and heads, as well as whether it has been formatted for visualization or not.
- format(aggr_method: AttentionAggregationMethod, zero_first_attention: bool) → None
Formats the wrapped attention matrix for HTML visualization, aggregating it based on the specified aggregation method.
- Parameters:
aggr_method – the aggregation method of the attention matrix. See AttentionAggregationMethod.
zero_first_attention – whether to ignore self-attention values towards the first token.
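A minimal sketch of wrapping and formatting attention weights; `raw_attentions` is a hypothetical placeholder for the attention weights produced during inference (in practice the matrix comes from SelfAttentionModel.generate_text, documented below):

```python
from att_viz.attention_aggregation_method import AttentionAggregationMethod
from att_viz.attention_matrix import AttentionMatrix

matrix = AttentionMatrix(raw_attentions)  # raw_attentions: hypothetical placeholder
matrix.format(
    aggr_method=AttentionAggregationMethod.HEADWISE_AVERAGING,
    zero_first_attention=True,  # ignore attention towards the first token
)
```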
att_viz.renderer module
- class att_viz.renderer.RenderConfig(y_margin: float = 30, line_length: float = 770, num_chars_block: float = 11, token_width: float = 110, min_token_width: float = 20, token_height: float = 22.5, x_margin: float = 20, matrix_width: float = 115)
Bases:
object
Rendering configuration class for specifying JavaScript preferences.
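All fields carry the defaults shown in the signature, so a sketch of a custom configuration only needs to override the values it cares about (the numbers below are arbitrary):

```python
from att_viz.renderer import RenderConfig

config = RenderConfig(line_length=900, token_height=24.0)  # arbitrary overrides
```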
- class att_viz.renderer.Renderer(render_config: RenderConfig, aggregation_method: AttentionAggregationMethod = AttentionAggregationMethod.NONE)
Bases:
object
Rendering class for visualizing self-attention matrices.
- create_token_info(tokens: list[str]) → tuple[list[tuple[int, int, int, int]], float]
Used for the JavaScript visualization. Computes layout coordinates for a list of tokens.
- Parameters:
tokens – the list of tokens to lay out
- Returns:
a pair (res, dy), where res contains the layout coordinates of the given tokens and dy is the total height of the computed token sequence.
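A sketch of calling this helper directly; the token strings are illustrative:

```python
from att_viz.renderer import RenderConfig, Renderer

renderer = Renderer(RenderConfig())
token_info, dy = renderer.create_token_info(["Hello", ",", " world"])
# token_info holds one layout tuple per token; dy is the total height.
```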
- render(tokens: list[str], prompt_length: int, attention_matrix: AttentionMatrix, prettify_tokens: bool = True, render_in_chunks: bool = True) → None
Creates and saves one or more interactive HTML visualizations of the given attention matrix.
- Parameters:
tokens – the list of tokens of the prompt and model completion
prompt_length – the length of the prompt in tokens
attention_matrix – a formatted AttentionMatrix (see AttentionMatrix.format)
prettify_tokens – whether to remove special characters, e.g. Ġ, from tokens (default True)
render_in_chunks – whether to render the visualization in chunks (default True)
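A sketch of a full render call; `tokens`, `prompt_length`, and `attention_matrix` are placeholders for the values returned by SelfAttentionModel.generate_text (documented below), and the matrix is assumed to be formatted with the same aggregation method the renderer was given:

```python
from att_viz.attention_aggregation_method import AttentionAggregationMethod
from att_viz.renderer import RenderConfig, Renderer

renderer = Renderer(
    RenderConfig(),
    aggregation_method=AttentionAggregationMethod.HEADWISE_AVERAGING,
)

# tokens, prompt_length, attention_matrix: placeholders for generate_text output
renderer.render(tokens, prompt_length, attention_matrix)
```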
att_viz.self_attention_model module
- class att_viz.self_attention_model.SelfAttentionModel(model_name_or_directory: str)
Bases:
object
Wrapper for a self-attention model.
Loads and stores the model and its corresponding tokenizer.
- generate_text(prompt: str, max_new_tokens: int = 512, save_prefix: str | None = None, prompt_template: str | None = 'user\n{p}<|endoftext|>\nassistant\n') → tuple[list[str], AttentionMatrix, int]
Generates text and returns the completion, attention matrix, and prompt length (in tokens).
- Parameters:
prompt – the prompt to use for text generation
max_new_tokens – the maximum number of tokens to generate
save_prefix – the prefix to use when saving the computation results (default None)
prompt_template – the prompt template to use for text generation (default: 'user\n{p}<|endoftext|>\nassistant\n')
- Returns:
the generated completion as a list of tokens, the attention matrix, and the prompt length (in tokens)
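A sketch of a generation call; "gpt2" is only an illustrative choice, and any Hugging Face model name or local checkpoint directory should work:

```python
from att_viz.self_attention_model import SelfAttentionModel

model = SelfAttentionModel("gpt2")  # illustrative model choice
tokens, attention_matrix, prompt_length = model.generate_text(
    "What is self-attention?",
    max_new_tokens=128,
)
```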
- load_model(model_name_or_directory: str) → tuple[AutoModelForCausalLM, AutoTokenizer]
Loads and returns a pretrained Hugging Face model and its corresponding tokenizer.
- Parameters:
model_name_or_directory – the name of the model to load, or alternatively the directory from which to load the model
- Returns:
the loaded model and tokenizer
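Continuing the sketch above, the loader can also be called directly to obtain the underlying Hugging Face objects (the constructor presumably invokes it internally):

```python
hf_model, hf_tokenizer = model.load_model("gpt2")  # model as in the sketch above
```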
att_viz.utils module
- class att_viz.utils.Experiment(model: SelfAttentionModel, renderer: Renderer)
Bases:
object
A simple attention visualization experiment.
- basic_experiment(prompt: str, aggr_method: AttentionAggregationMethod) → None
A simple text generation experiment, in which the resulting prompt-completion pair is visualized with self-attention information in HTML format.
- Parameters:
prompt – the prompt to use for text generation
aggr_method – the aggregation method of the attention matrix. See AttentionAggregationMethod.
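A sketch of an end-to-end experiment, again with "gpt2" as an illustrative model choice:

```python
from att_viz.attention_aggregation_method import AttentionAggregationMethod
from att_viz.renderer import RenderConfig, Renderer
from att_viz.self_attention_model import SelfAttentionModel
from att_viz.utils import Experiment

experiment = Experiment(
    SelfAttentionModel("gpt2"),  # illustrative model choice
    Renderer(RenderConfig(), AttentionAggregationMethod.HEADWISE_AVERAGING),
)
experiment.basic_experiment(
    "Explain self-attention in one paragraph.",
    AttentionAggregationMethod.HEADWISE_AVERAGING,
)
```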
- att_viz.utils.process_saved_completions(render_config: RenderConfig, aggregation_method: AttentionAggregationMethod, save_prefixes: list[str]) → None
Renders inference results previously stored by save_completions (see the combined sketch after save_completions below).
- Parameters:
render_config – the rendering configuration. See RenderConfig.
aggregation_method – the aggregation method of the attention matrix. See AttentionAggregationMethod.
save_prefixes – the list of save prefixes that have been used for storing inference results
- att_viz.utils.save_completions(model_name_or_directory: str, prompts: list[str], save_prefixes: list[str]) → None
Loads a self-attention model and runs inference on the given prompts.
- Parameters:
model_name_or_directory – the name of the model to load, or alternatively the directory from which to load the model
prompts – the list of prompts to use for text generation
save_prefixes – the list of save prefixes to use for storing inference results; should have the same length as prompts
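A combined sketch of the two-phase workflow, save_completions followed by process_saved_completions; the model name, prompts, and prefixes are all illustrative:

```python
from att_viz.attention_aggregation_method import AttentionAggregationMethod
from att_viz.renderer import RenderConfig
from att_viz.utils import process_saved_completions, save_completions

prompts = ["What is attention?", "Why are attention scores normalized?"]
prefixes = ["run_0", "run_1"]  # one save prefix per prompt

# Phase 1: run inference once and store the results under the prefixes.
save_completions("gpt2", prompts, prefixes)  # illustrative model choice

# Phase 2: render the stored results, possibly with different render settings.
process_saved_completions(
    RenderConfig(),
    AttentionAggregationMethod.HEADWISE_AVERAGING,
    prefixes,
)
```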