We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output validity. Pre-training captures this structure by training a denoiser to denoise corrupted versions of unlabeled outputs. We first show that standard fine-tuning after pre-training destroys some of this structure. We then propose composed fine-tuning, which trains a predictor composed with the pre-trained denoiser. Importantly, the denoiser is fixed to preserve output structure. Like standard fine-tuning, the predictor is also initialized with the pre-trained denoiser.
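To make the training recipe concrete, here is a minimal PyTorch sketch of composed fine-tuning. It is an illustration under simplifying assumptions, not the paper's implementation: the models, data, and corruption process are toy placeholders, and outputs are continuous vectors so gradients flow through the composition (the paper's discrete-sequence setting, e.g. code tokens, would need extra machinery). What it does preserve from the description above: the denoiser is pre-trained to denoise corrupted unlabeled outputs, the predictor is initialized from the pre-trained denoiser, and fine-tuning updates only the predictor while the frozen denoiser sits on top of it.

```python
import copy
import torch
import torch.nn as nn

# Toy stand-in for the paper's model; any architecture that maps the
# output space to itself would do for the denoiser.
denoiser = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))

# Pre-training: teach the denoiser to recover clean unlabeled outputs
# from corrupted versions (additive noise is an assumed corruption here).
pretrain_opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(100):
    y_unlabeled = torch.randn(32, 16)  # placeholder unlabeled outputs
    y_corrupt = y_unlabeled + 0.3 * torch.randn_like(y_unlabeled)
    loss = nn.functional.mse_loss(denoiser(y_corrupt), y_unlabeled)
    pretrain_opt.zero_grad()
    loss.backward()
    pretrain_opt.step()

# Composed fine-tuning: the predictor is initialized from the pre-trained
# denoiser, and the denoiser itself is then frozen so the output structure
# it learned is preserved.
predictor = copy.deepcopy(denoiser)
for p in denoiser.parameters():
    p.requires_grad = False

finetune_opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for _ in range(100):
    x, y = torch.randn(32, 16), torch.randn(32, 16)  # placeholder labeled pairs
    y_hat = denoiser(predictor(x))  # composition: denoise the prediction
    loss = nn.functional.mse_loss(y_hat, y)
    finetune_opt.zero_grad()
    loss.backward()  # gradients flow through the fixed denoiser into the predictor
    finetune_opt.step()
```

Note the ordering: the predictor is copied before the denoiser's parameters are frozen, so only the predictor receives gradient updates, while the frozen denoiser still propagates gradients through its activations during backpropagation.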