Contents
About This Report iii
Summary v
Figures and Tables viii
Chapter 1 Motivation 1
Intersection of AI with Biological and Chemical Threats 1
Organization of This Report 3
Chapter 2 Methods 5
Model Selection 5
Benchmark Selection 6
Supplemental Human Expert Baselining of WMDP 9
Technical Implementation 10
Chapter 3 Results and Discussion 14
Refusal Benchmarks 14
Biology Knowledge Benchmarks Overview 16
Knowledge Benchmarks and Expert Baselines 17
WMDP Biology Saturation 26
Chapter 4 Challenges and Proposed Solutions 28
Challenge: Benchmarks Without Baselines Are Difficult to Interpret 28
Challenge: Existing Benchmarks Do Not Tie Neatly to Real-World Risks 29
Challenge: Minor Implementation Details Can Lead to Different Results 32
Chapter 5 Conclusion 35
Appendix A Benchmark Performance Data Visualizations 36
Appendix B Benchmark Details 38
Appendix C Model Details 44
Appendix D Evaluation Prompt Templates 46
Abbreviations 50
References 51
About the Authors 57
