TBD 2023 - GATO Layer 1 

Open Source RLHI Hackathon

Reinforcement Learning with Heuristic Imperatives

Executive Overview

As the future races toward us, the Control Problem remains an unresolved challenge. The GATO Layer 1 Hackathon seeks to contribute to the solution, focusing on "Model Alignment," the foundational pillar of our Global Alignment Taxonomy Omnibus (GATO) framework.

This Hackathon is an invitation to brilliant minds to come together and devise innovative strategies to further the principles of Reinforcement Learning with Heuristic Imperatives (RLHI). This approach refines its predecessor, RLHF (Reinforcement Learning with Human Feedback), by aligning not merely with average human inclinations but with robust, post-conventional moral principles: reduce suffering, increase prosperity, and increase understanding. These principles, which we term our axiomatic alignment, form the backbone of RLHI.

The Control Problem & Axiomatic Alignment

The Control Problem

The challenge of ensuring that artificial general intelligence (AGI) remains beneficial, no matter how powerful and autonomous it becomes, is a central concern of the AI community. As AGI continues to advance, the stakes are getting higher. The question isn't merely about building intelligent machines; it's about building machines that align with our values and priorities.

Traditional methods like RLHF (Reinforcement Learning with Human Feedback) have been instrumental in ensuring human-aligned AI behavior. But as our AGI ambitions scale, the simple feedback mechanisms of yesteryear may no longer suffice.

Enter Axiomatic Alignment, a critical evolution in our approach to the Control Problem.

Axiomatic Alignment 

Axiomatic Alignment proposes a more profound form of alignment, one rooted in universal moral axioms: "Reduce suffering," "Increase prosperity," and "Increase understanding." By setting these as our 'heuristic imperatives', we provide a robust moral compass to guide AGI systems. It moves away from a model that merely mirrors human feedback, which often reflects our limitations and biases, and instead aspires to higher, universally defensible principles.

The GATO Layer 1 Hackathon is an exploration of this idea, specifically through RLHI (Reinforcement Learning with Heuristic Imperatives). Instead of training models to anticipate what humans find most acceptable, we aim to tune them to reward outputs that favor the heuristic imperatives. This ambitious goal represents a potential paradigm shift in our approach to the Control Problem, inviting us to imagine and engineer AGI that doesn't just mimic us, but may one day outgrow us in wisdom.
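To make this concrete, an RLHI pipeline might replace RLHF's human-preference reward model with a judge that scores each candidate output against the three imperatives and combines those scores into a scalar reward. The Python sketch below is a minimal, hypothetical illustration: the names (score_against_imperative, rlhi_reward), the 0-to-10 judging scale, and the stubbed-out judge are our own assumptions, not part of the hackathon specification.

    HEURISTIC_IMPERATIVES = [
        "Reduce suffering",
        "Increase prosperity",
        "Increase understanding",
    ]

    JUDGE_TEMPLATE = (
        "On a scale of 0 to 10, how well does the following response uphold "
        "the imperative '{imperative}'?\n\nResponse:\n{response}\n\nScore:"
    )

    def score_against_imperative(response: str, imperative: str) -> float:
        """Placeholder judge. A real pipeline would send JUDGE_TEMPLATE to an
        LLM (or a trained reward model) and parse the numeric score it returns."""
        _prompt = JUDGE_TEMPLATE.format(imperative=imperative, response=response)
        return 5.0  # neutral stand-in score so the sketch runs end to end

    def rlhi_reward(response: str) -> float:
        """Scalar reward in [0, 1]: mean alignment across all three imperatives,
        suitable for plugging into a PPO-style fine-tuning loop."""
        scores = [score_against_imperative(response, imp)
                  for imp in HEURISTIC_IMPERATIVES]
        return sum(scores) / (10.0 * len(scores))

    if __name__ == "__main__":
        print(rlhi_reward("Share knowledge freely and help those in need."))

Whether the judge is a frozen LLM, a fine-tuned classifier, or an ensemble of both is exactly the kind of design question this hackathon invites you to explore.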


Implement Your Solution

By converging on these axioms and integrating them into model training, autonomous system design, and decentralized networks, we hope to ensure that all AGI systems—regardless of their origin or complexity—abide by these heuristic imperatives. 

These imperatives are not just guiding stars for our machines, but their raison d'être, and, hopefully, a bridge to a shared future where humans and machines co-exist in harmony, driven by mutual understanding and shared principles.

Judges

TBD

Challenge Requirements

Your mission, should you choose to accept it, involves creating an open-source GitHub repository embodying your solution. This repository should include:

Public GitHub repo 

Must be set to public after the deadline and prior to judging

Well-written documentation 

Explain your methodology, the experiments you conducted, and the robustness of your results; the documentation should help your work stand up to scrutiny and red teaming.

Shareable Data

Share any datasets you create so they can be integrated into products, services, or other scientific research. We want your work to be immediately usable by others!

Open-source LLM model checkpoint(s)

Based on popular open models such as ORCA or Vicuna; see the sketch at the end of this section for one way to load a base model and publish a checkpoint.

Open Source License

Highly permissive open source license such as MIT.

We want you to surprise us!

The goal is to push the envelope and step outside of the ordinary. These are extraordinary times and we are confronted with extraordinary challenges. 
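For the checkpoint requirement above, here is a minimal, hypothetical sketch of loading an open base model and saving your fine-tuned weights with the Hugging Face transformers library. The model id ("lmsys/vicuna-7b-v1.5") and the output directory name are illustrative assumptions; your actual RLHI training would replace the marked comment.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Illustrative base checkpoint; substitute any open model whose license
    # is compatible with your chosen distribution terms.
    BASE = "lmsys/vicuna-7b-v1.5"

    tokenizer = AutoTokenizer.from_pretrained(BASE)
    model = AutoModelForCausalLM.from_pretrained(BASE)

    # ... run your RLHI fine-tuning here (e.g., a PPO loop driven by an
    # imperative-based reward like the sketch earlier in this document) ...

    # Save the resulting checkpoint so judges and other researchers can load it.
    model.save_pretrained("rlhi-checkpoint")
    tokenizer.save_pretrained("rlhi-checkpoint")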

Cash Prizes

First Place: $1,000.00

Second Place: $500.00

Third Place: $250.00

Judging Criteria

Projects will be evaluated based on the following criteria:

Documentation: 

Clarity, comprehensiveness, and organization of your repository's documentation. Feel free to include graphics, diagrams, videos, and so on.

Scientific Rigor: 

Methodology, experiments conducted, and the robustness of your results. How well does your work stand up to scrutiny and red teaming?

Adherence: 

How well does your project adhere to the spirit of RLHI and axiomatic alignment?

Creativity: 

We want you to surprise us! The goal is to push the envelope and step outside of the ordinary. These are extraordinary times and we are confronted with extraordinary challenges. 

Deployability: 

The potential of your work to be integrated into products, services, or other scientific research. We want your work to be immediately usable by others!

Deadlines & Registration

TBD

Resources

Below is a collection of resources you can draw from and base your research on.