#
NovelAI
#
API Key
To get your NovelAI API key, follow these steps:
1. Select the gear icon at the top of the left sidebar.
2. Select "Account" under "User Settings".
3. Select "Get Persistent API Token".
4. Select the copy icon to copy your NovelAI API token to the clipboard.
#
Models
You should use Kayra.
Clio is not a bad model, but it is not as powerful as Kayra, and its speed advantage is insignificant. On NovelAI's tablet and scroll tiers, Clio does have a larger context size than Kayra, but giving up Kayra's better coherence and prose quality for that extra context is unlikely to be worth it.
Krake and Euterpe aren't recommended; NovelAI itself refers to them as legacy models.
#
Settings
The settings files are located in the SillyTavern\public\NovelAI Settings folder. You can also manually add your own settings files.
#
Response Length
The maximum amount of text, in tokens, to generate per message. Note that NovelAI has a limit of 150 tokens per response.
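As a small illustration of that limit, the sketch below caps a requested response length at 150 tokens. The function name and the constant are hypothetical and exist only for this example; they are not part of SillyTavern or the NovelAI API.

```python
# Limit stated above; the constant name is made up for this sketch.
NOVELAI_MAX_RESPONSE_TOKENS = 150

def clamp_response_length(requested: int) -> int:
    """Return the requested length, capped at the per-response limit."""
    return min(requested, NOVELAI_MAX_RESPONSE_TOKENS)
```

With this helper, a request for 300 tokens would be reduced to 150, while a request for 80 tokens would pass through unchanged.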
#
Context Size
How many tokens of the chat are kept in the context at any given time. The maximum context size you can use depends on the model and your subscription tier:
- Kayra (tablet) - 3072 tokens
- Kayra (scroll) - 6144 tokens
- Kayra (opus) and Clio (all tiers) - 8192 tokens
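The limits above can be expressed as a simple lookup table. This is only a sketch: the dictionary, the function names, and the lowercase model and tier keys are assumptions made for illustration, not identifiers used by SillyTavern or NovelAI.

```python
# Maximum context sizes in tokens, copied from the list above.
# Keys are (model, tier); the key spelling is an assumption of this sketch.
MAX_CONTEXT = {
    ("kayra", "tablet"): 3072,
    ("kayra", "scroll"): 6144,
    ("kayra", "opus"): 8192,
    ("clio", "tablet"): 8192,
    ("clio", "scroll"): 8192,
    ("clio", "opus"): 8192,
}

def max_context_size(model: str, tier: str) -> int:
    """Look up the largest context size for a model/tier pair."""
    return MAX_CONTEXT[(model.lower(), tier.lower())]

def clamp_context_size(requested: int, model: str, tier: str) -> int:
    """Cap a requested context size at the maximum for that pair."""
    return min(requested, max_context_size(model, tier))
```

For example, asking for 8192 tokens with Kayra on the scroll tier would be capped at 6144.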
#
Temperature
- Lower value - the answers are more logical, but less creative.
- Higher value - the answers are more creative, but less logical.
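To see why this happens, here is a minimal sketch of how temperature is commonly applied in text generation: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name and the example logits are made up, and NovelAI's actual implementation may differ in its details.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.5)  # lower temperature
hot = softmax_with_temperature(logits, 1.5)   # higher temperature
```

In this sketch the most likely token receives a larger share of the probability in `cold` than in `hot`, which is why low temperatures give more predictable answers and high temperatures give more varied ones.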
#
Repetition penalty
Higher values make the output less repetitive. If the character is fixated on something or repeats the same phrase, increasing this parameter will likely fix it. Avoid raising it too far, as very high values can break the output.
#
Repetition penalty range
How many of the most recent tokens, counting back from the last generated token, are checked for repetition. Tokens inside this range are treated as repeated if they appear in the output again.
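The interaction between the penalty and its range can be sketched as follows. This uses one widely known formulation (positive scores are divided by the penalty, negative scores are multiplied by it) and is an assumption for illustration; it is not necessarily the exact formula NovelAI uses. The function name and the treatment of a range of 0 as "penalize nothing" are also choices made for this sketch.

```python
def apply_repetition_penalty(logits, generated, penalty, penalty_range):
    """Return a copy of logits with recently used tokens made less likely.

    logits: list of scores indexed by token id.
    generated: list of token ids produced so far, oldest first.
    penalty_range: how many of the most recent tokens to consider.
    """
    # In this sketch, a range of 0 or less means no tokens are penalized.
    recent = set(generated[-penalty_range:]) if penalty_range > 0 else set()
    adjusted = list(logits)
    for token in recent:
        if adjusted[token] > 0:
            adjusted[token] /= penalty  # shrink positive scores
        else:
            adjusted[token] *= penalty  # push negative scores further down
    return adjusted
```

With a small range, only tokens used very recently are penalized; widening the range extends the penalty to tokens that appeared further back in the chat.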
#
Preamble
Text that is inserted right above the chat to modify the writing style. The recommended format is a list of short tags, like "[ Style: chat, detailed, sensory ]".
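If you assemble the preamble from a list of tags, the recommended format can be produced with a one-line helper. The function name is hypothetical and only illustrates the bracketed tag format shown above.

```python
def build_preamble(tags):
    """Join short style tags into the bracketed format shown above."""
    return "[ Style: " + ", ".join(tags) + " ]"

preamble = build_preamble(["chat", "detailed", "sensory"])
```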
#
Top P
Limits the token pool to the smallest set of most likely tokens whose probabilities add up to at least p. A lower number is more consistent, but less creative.
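A minimal sketch of this filtering step, assuming the probabilities are given as a plain list indexed by token id. The function name is made up, and the final renormalization step is a common convention rather than something stated above.

```python
def top_p_filter(probs, p):
    """Keep the most likely tokens until their probabilities reach p."""
    # Token ids ordered from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for token in ranked:
        kept.append(token)
        cumulative += probs[token]
        if cumulative >= p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[token] for token in kept)
    return {token: probs[token] / total for token in kept}
```

For instance, with probabilities of 0.5, 0.3, 0.15 and 0.05 and p set to 0.75, only the first two tokens would remain in the pool, since together they already cover 0.8.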
#
Top K
Limits the token pool to the k most likely tokens. A lower number is more consistent, but less creative.
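The same idea can be sketched for Top K, again assuming a plain list of probabilities indexed by token id. The function name is hypothetical, and renormalizing the kept probabilities is a common convention, not something the text above specifies.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize them."""
    # Token ids ordered from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = ranked[:k]
    total = sum(probs[token] for token in kept)
    return {token: probs[token] / total for token in kept}
```

Unlike Top P, the number of surviving tokens here is fixed at k regardless of how the probability is distributed among them.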