Humanity Unleashed
Putting AI in Service of the World
Appendix
A. Artificial Superintelligence and Existential Risk
We try not to dwell on the existential nature of the impending emergence of Artificial Superintelligence (ASI), as it is an overwhelming reality to face. In framing why the development of Social Environment Design is so critical and time-sensitive, however, this context is essential. It is impossible to overstate the risks that ASI would pose to a society that has not moved beyond its dysfunctional and combative tendencies. Creating AIs that are oriented toward dominating humans and beyond human capacity to stop is both incredibly dangerous and, given current societal power structures and their orientation, incredibly likely.
The AI principal policymaker of SED need not be an ASI to be orders of magnitude more effective than our existing approaches to fostering cooperation and value alignment. It is exactly that cooperation and value alignment that will be needed to allow actors around the world to coordinate to prevent the emergence of a misaligned Superintelligence that can threaten the survival of life on earth.
A particularly advantaged member of our society might be convinced that they can fully insulate themselves from the impacts of other cascading, systemic failures such as climate change, and so may not feel that any significant modification to the status quo is needed. Superintelligent AI changes that calculus. Although it is counterintuitive based on past human experience, as AI capacities expand, there are only three likely end states:
A global, pluralistic society that places collective wellbeing as the top priority through an AI/human hybrid governance model of the type we describe,
A global, totalitarian state that entrenches the interests of a single group or individual by using AI as an all-powerful enforcer, or
Human extinction due to such an AI that is misaligned and turns on its controllers, most likely a successful or aspiring totalitarian state of this kind.
It is highly unlikely that any single individual, no matter how presently advantaged, would be better off, or even likely to survive, in a world with Superintelligence and without established, effective global governance based on pluralistic values. This is the case even if they have no issue surrendering their agency, or if they misguidedly think that they individually might be able to rule the world.
B. The Neglected Aspect of AI Safety
AI safety, or the alignment of AIs with humans, is an area of increasing scrutiny for industry and governmental actors. Ensuring that the tool you create can be controlled is fundamental to its function and to the safety of its creators and users. Significantly more resources are needed in this direction, but there at least seems to be consensus among many of the most powerful players in the field that this is important.
Approaches to the problem of AI safety generally center on ensuring that the creators of the AI can:
Interpret what the AI is doing, potentially with the assistance of other, trusted AI systems
Ensure that the AI follows their values and intentions
Interrupt or shut down the AI if needed
These are all very important considerations, and there is a definite need to scale investment in this direction as society leverages AI for more and more tasks within our economic and social systems. What this approach, largely intentionally, neglects is the alignment of whatever entity controls the AI with the global population. Our legacy systems are built on the concept of power aggregation by individuals and groups, and placing advancements of any sort outside of that context is not something those systems are designed to do.
This gap will be exacerbated by any moves by national governments to accelerate AI development as a military tool. This goes beyond the types of militarization of AI already taking place on battlefields from Ukraine to Gaza. Instead, this effort would be to create the type of AI superintelligence that would represent a decisive advantage over all adversaries. Indeed, if it could be adequately controlled, such an AI would guarantee victory against any threat, and the first nation to attain it would become a global hegemon.
Given this, there is a compelling case to be made that national governments, particularly the US and China, will rush toward ASI once it becomes clear that the goal is attainable (potentially within the next ~2 years). Our unmitigated failure to thwart the militarization of AI thus far, alongside ongoing failures within the global community to address slower-moving tragedies of the commons such as climate change, gives little hope that such an arms race can be avoided or even controlled using current cooperative frameworks.
C. Tracking Framework Features
Specification
Contributions are stored in an auditable database, and validated with a hash of the contents stored “on-chain” on a distributed ledger such as Solana. This ensures that the contents are consistent from the time of contribution to the point of future audit. Each project has such a “Proof of Contribution,” or PoC, which can feature static percentages of relative contribution, records of contribution activity, or both.
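As a rough illustration of this anchoring step, the Python sketch below hashes a contribution record deterministically; the record fields and the publish_to_ledger helper are illustrative assumptions, not a prescribed schema or a specific Solana API.

    # Minimal sketch: hash a contribution record so that an on-chain digest can
    # later be compared against the auditable off-chain database entry.
    # Field names and publish_to_ledger() are illustrative assumptions.
    import hashlib
    import json

    def content_digest(record: dict) -> str:
        """Serialize the record deterministically and return its SHA-256 hex digest."""
        canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    contribution = {
        "project": "example-project",
        "contributor": "contributor-id",
        "activity": "description of the work performed",
        "timestamp": "2025-01-01T00:00:00Z",
    }

    digest = content_digest(contribution)
    # store_record(contribution, digest)   # auditable off-chain database entry
    # publish_to_ledger(digest)            # hypothetical on-chain anchoring call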
Contributions accrue to projects, which are integrated into the “SED Ecosystem” and contribute to its success. The relative impact of the projects is assessed by the AI retrospectively, easing the process of fair value attribution considerably. Proportionate contribution to individual projects can be similarly assessed based on the data stored in the PoC.
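As a toy illustration of how these two assessments could compose, the sketch below multiplies AI-assessed project impact scores by the per-contributor shares recorded in each project's PoC; all names and numbers are invented for the example.

    # Toy attribution: AI-assessed project impact scores combined with the
    # per-contributor shares recorded in each project's Proof of Contribution.
    # All identifiers and numbers here are illustrative.
    project_impact = {"project-a": 0.7, "project-b": 0.3}  # assessed retrospectively by the AI

    poc_shares = {
        "project-a": {"alice": 0.6, "bob": 0.4},
        "project-b": {"bob": 0.5, "carol": 0.5},
    }

    rewards = {}
    for project, impact in project_impact.items():
        for contributor, share in poc_shares[project].items():
            rewards[contributor] = rewards.get(contributor, 0.0) + impact * share

    print(rewards)  # approximately {'alice': 0.42, 'bob': 0.43, 'carol': 0.15}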
All projects must be open source, allowing the system to be interoperable with other approaches to valuing contributions to public goods retroactively. Founding Labs will build tools, freely available to anyone, to make that tracking easier.
Retrospective values are based on the AI’s assessment of each individual’s relative contribution to a project. They are guided by the impact the project achieves, in conjunction with humanity’s values as they pertain to desired impacts and to reward “fairness”: natural language statements that the group agrees upon through the voting mechanism (see Reward System Guidelines).
Proof of Concept
Uses Git (via GitHub) as a single source of truth
All code is kept in GitHub
External apps are synced with GitHub periodically and automatically
A repo with pointers to all activity acts as the single source of truth for actions taken
Audio and other large files use Git Large File Storage (LFS), with a hash of the file contents
Full event history is periodically backed up outside of GitHub (to Google Drive)
Each backup is validated through a hash of the contents stored in the repo and on a public blockchain (see the sketch after this list)
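A minimal verification sketch for that last step, assuming SHA-256 digests and placeholder file paths:

    # Verify that an external backup still matches the digest recorded in the
    # repo (and mirrored on a public blockchain). File paths are placeholders.
    import hashlib

    def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    with open("backups/2025-01-01.sha256") as f:       # digest committed to the repo
        recorded = f.read().strip()
    actual = file_digest("backups/2025-01-01.tar.gz")  # backup copy retrieved from Google Drive
    assert actual == recorded, "backup contents differ from the recorded hash"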
D. Reward System Guidelines
(Axiom) Within the bounds of the guidelines below, the reward structure should maximize aggregate individual motivation to contribute.
(Axiom) All contributions yield additional reward; however, reward is scaled down based upon each contributor’s total rewards already received. This is similar to progressive taxation, with the exact level being optimized and voted on through the SED process (see the sketch after these guidelines).
Contributions made earlier, when risks and uncertainty are higher, are more valuable than contributions made later, when risks and uncertainty are lower.
Contributions that create or allow for greater positive impacts as imputed by the values of the group are more valuable.
Independent of positive impact, efforts made in good faith have value based upon the degree of that effort, but this value is significantly less than that of a similar effort that proves impactful.
The benefits of public goods are calculated and accrue to the projects that enable them.
Rewards are distributed when there is a high degree of certainty that impacts have not been overestimated.
Rewards may be adjusted based on new information, new analysis, or newly imputed values.
The bar is higher for reducing rewards already distributed than for issuing new rewards based on subsequent positive impacts.
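One way the scaling in the second axiom could be made concrete is sketched below; the bracket thresholds and retention rates are placeholder assumptions, since the actual schedule would be optimized and voted on through the SED process.

    # Illustrative progressive scaling of new rewards based on a contributor's
    # cumulative rewards, analogous to marginal tax brackets. The thresholds
    # and retention rates are placeholders, not agreed-upon values.
    BRACKETS = [          # (cumulative-reward threshold, retention rate)
        (1_000.0, 1.00),  # first 1,000 units retained in full
        (10_000.0, 0.75),
        (float("inf"), 0.50),
    ]

    def scaled_reward(raw_reward: float, total_received: float) -> float:
        """Scale a new reward down across brackets the contributor has already partly filled."""
        remaining, granted, position = raw_reward, 0.0, total_received
        for threshold, rate in BRACKETS:
            if remaining <= 0:
                break
            room = max(threshold - position, 0.0)
            portion = min(remaining, room)
            granted += portion * rate
            remaining -= portion
            position += portion
        return granted

    print(scaled_reward(2_000.0, 500.0))  # 500 * 1.00 + 1,500 * 0.75 = 1625.0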
These guidelines are designed to be updated over time based upon the values of the group, as intermediated by the AI agent within SED. As such, they will change, but should move in the direction of greater efficacy and fairness as more people contribute their input and values to the system.
E. The World's First Automated Policymaker
Social Environment Design is a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The framework seeks to capture general economic environments, includes voting on policy objectives, and gives a direction for the systematic analysis of government and economic policy through AI simulation. The key elements of SED are as follows:
Voting on preferences by a group of agents – either humans or simulations of humans – creates an aggregated objective, which can be multifaceted.
Preferences are passed to an "inner loop" consisting of an AI “Principal Agent,” which instantiates a simulated world in which the other agents will interact, including changes to the environment and incentives for the agents.
Agents interact in the simulated world for a set period of time, or one “round” within the inner loop.
Based upon the success of the instantiated world in optimizing for the aggregated objective, the Principal Agent modifies the incentive structure and/or simulated world state (with the amount of change bounded), and the "inner loop" repeats.
As the simulation begins to converge on the aggregated objective, the process passes back to the outer loop for another round of voting and values elicitation. Based on their outcomes and experiences, the agents update their preferences through the voting mechanism, and these are passed back to the Principal Agent, restarting the cycle.
In this way, the AI Principal Agent helps to steer the human participants toward the outcomes they wish to attain, without them having to know how best to achieve those outcomes, or even what those outcomes are, at the outset. This is done in simulation, reducing risk and allowing for the massive parallelization of the process through human simulacra. Humans steer the process throughout, with repeated opportunities to redirect the process if it begins to manifest undesired outcomes.
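To make the loop structure concrete, here is a deliberately trivial, runnable sketch of the outer voting loop wrapping the inner simulation loop; the single-parameter "world," the median-based aggregation, and the preference-update rule are invented stand-ins rather than any part of the actual SED implementation.

    # Toy sketch of the SED loop: agents vote on a target value (the aggregated
    # objective), a Principal Agent adjusts one policy parameter in bounded
    # steps until the simulated outcome approaches it, then agents revise
    # their preferences and the cycle repeats. Everything here is a stand-in.
    import statistics

    agent_preferences = [2.0, 4.0, 9.0]   # each agent's preferred outcome
    policy = 0.0                          # the parameter the Principal Agent controls
    MAX_STEP = 1.0                        # bound on per-round policy change

    for outer_round in range(3):                              # outer loop: voting
        objective = statistics.median(agent_preferences)      # aggregated objective
        for inner_round in range(50):                         # inner loop: simulation rounds
            outcome = policy                                   # trivial "world model"
            error = objective - outcome
            if abs(error) < 1e-3:                              # convergence check
                break
            policy += max(-MAX_STEP, min(MAX_STEP, error))     # bounded update
        # agents revise preferences after seeing the simulated outcome
        agent_preferences = [0.9 * p + 0.1 * outcome for p in agent_preferences]
        print(f"round {outer_round}: objective={objective:.2f}, outcome={outcome:.2f}")

In the full framework, the "world model" would be a rich multi-agent simulation and the bounded update would be proposed by the Principal Agent itself rather than a fixed rule.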
No development on collective decision-making matters if we cannot use it to avoid an AI arms race and potential hot conflict between the US and China. The purpose of the work Humanity Unleashed is doing is to ensure that humanity can progress toward AI Superintelligence with minimal risk and maximal potential for upside, and such a conflict would massively decrease the likelihood that this transition can be effected safely.
Using AI technology that is not on the leading edge (and so does not increase the risk of misaligned AI) to facilitate more functional collaboration between nations generally, and on AI safety in particular, is within scope for Social Environment Design.