Eval: Consensus Summary #1140
Merged
Collaborator
Updated version of #988.
usama-openai
approved these changes
Jun 12, 2023
Collaborator
usama-openai
left a comment
Thanks for submitting this eval! This PR looks good. I'm approving this PR.
Collaborator
You should see GPT-4 API access enabled in your account in the next few days.
arbreton
pushed a commit
to arbreton/evals
that referenced
this pull request
Jul 8, 2023
# Thank you for contributing an eval! ♥️

🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed automatically__. Note that even if the criteria are met, that does not guarantee the PR will be merged nor GPT-4 access granted. 🚨

__PLEASE READ THIS__:

In order for a PR to be merged, it must fail on GPT-4. We are aware that right now, users do not have access, so you will not be able to tell if the eval fails or not. Please run your eval with GPT-3.5-Turbo, but keep in mind as we run the eval, if GPT-4 gets higher than 90% on the eval, we will likely reject since GPT-4 is already capable of completing the task.

We plan to roll out a way for users submitting evals to see the eval performance on GPT-4 soon. Stay tuned! Until then, you will not be able to see the eval performance on GPT-4. **Starting April 10, the minimum eval count is 15 samples, we hope this makes it easier to create and contribute evals.**

Also, please note that we're using **Git LFS** for storing the JSON files, so please make sure that you move the JSON file to Git LFS before submitting a PR. Details on how to use Git LFS are available [here](https://git-lfs.com).

## Eval details 📑

### Eval name

Consensus Summary

### Eval description

Utilize the model's ability to produce a Scientific Consensus in response to a scientific inquiry using the provided claims.

### What makes this a useful eval?

This is a useful eval because it evaluates the model's ability to produce a scientific consensus in response to a given set of claims. This is important because scientific consensus is the result of multiple studies and data that may or may not support the same conclusion. A model that can accurately produce scientific consensus can help in making informed decisions and policies based on scientific evidence. Hence, evaluating a model's ability to produce a scientific consensus using the Consensus Summary eval can be useful in assessing its reliability and potential for practical applications.

## Criteria for a good eval ✅

Below are some of the criteria we look for in a good eval. In general, we are seeking cases where the model does not do a good job despite being capable of generating a good response (note that there are some things large language models cannot do, so those would not make good evals).

Your eval should be:

- [x] Thematically consistent: The eval should be thematically consistent. We'd like to see a number of prompts all demonstrating some particular failure mode. For example, we can create an eval on cases where the model fails to reason about the physical world.
- [x] Contains failures where a human can do the task, but either GPT-4 or GPT-3.5-Turbo could not.
- [x] Includes good signal around what is the right behavior. This means either a correct answer for `Basic` evals or the `Fact` Model-graded eval, or an exhaustive rubric for evaluating answers for the `Criteria` Model-graded eval.
- [x] **Include at least 15 high quality examples.**

If there is anything else that makes your eval worth including, please document it below.

### Unique eval value

> Insert what makes your eval high quality that was not mentioned above. (Not required)

## Eval structure 🏗️

Your eval should

- [x] Check that your data is in `evals/registry/data/{name}`
- [x] Check that your yaml is registered at `evals/registry/evals/{name}.yaml`
- [x] Ensure you have the right to use the data you submit via this eval

(For now, we will only be approving evals that use one of the existing eval classes. You may still write custom eval classes for your own cases, and we may consider merging them in the future.)

## Final checklist 👀

### Submission agreement

By contributing to Evals, you are agreeing to make your evaluation logic and data under the same MIT license as this repository. You must have adequate rights to upload any data used in an Eval. OpenAI reserves the right to use this data in future service improvements to our product. Contributions to OpenAI Evals will be subject to our usual Usage Policies (https://platform.openai.com/docs/usage-policies).

- [x] I agree that my submission will be made available under an MIT license and complies with OpenAI's usage policies.

### Email address validation

If your submission is accepted, we will be granting GPT-4 access to a limited number of contributors. Access will be given to the email address associated with the merged pull request.

- [x] I acknowledge that GPT-4 access will only be granted, if applicable, to the email address used for my merged pull request.

### Limited availability acknowledgement

We know that you might be excited to contribute to OpenAI's mission, help improve our models, and gain access to GPT-4. However, due to the requirements mentioned above and high volume of submissions, we will not be able to accept all submissions and thus not grant everyone who opens a PR GPT-4 access. We know this is disappointing, but we hope to set the right expectation before you open this PR.

- [x] I understand that opening a PR, even if it meets the requirements above, does not guarantee the PR will be merged nor GPT-4 access granted.

### Submit eval

- [x] I have filled out all required fields of this form
- [x] I have used **Git LFS** for the Eval JSON data
- [x] (Ignore if not submitting code) I have run `pip install pre-commit; pre-commit install` and have verified that `black`, `isort`, and `autoflake` are running when I commit and push

Failure to fill out all required fields will result in the PR being closed.

### Eval JSON data

Since we are using Git LFS, we are asking eval submitters to add in as many Eval Samples (at least 5) from their contribution here:

<details>
<summary>View evals in JSON</summary>

### Eval
```jsonl
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: Two doses of mRNA covid-19 vaccines were observed to be highly effective against symptomatic infection and severe outcomes.\nclaim: COVID-19 vaccines currently authorized in the United States are highly effective in preventing COVID-19-associated hospitalizations in older adults.\nclaim: In summary, vaccines are a powerful tool that can be used to control the COVID-19 pandemic, with high efficacy and tolerable ADRs.\nclaim: Conclusion Overall, we conclude that vaccination against COVID-19 in patients with active malignancies using activated and inactivated vaccines is a safe and tolerable procedure that is also accompanied by a high efficacy.\nclaim: COVID-19 vaccines provide good protection against COVID-19 presentation at primary care/outpatient level, particularly among fully vaccinated individuals.\nquestion: are covid-19 vaccines effective?"}], "ideal": "Summary: Covid-19 vaccines are highly effective at protecting against infection and hospitalization."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: Lower zinc is a hallmark of depression, while increments in serum zinc and attenuation of the immune-inflammatory response during treatment appear to play a role in the clinical efficacy of sertraline.\nclaim: An increase in dietary zinc and higher plasma zinc levels may reduce the risk of depressive symptoms.\nclaim: Although decreased zinc levels have been implicated in the genesis of depression in animal models and in major depressive disorder in humans, this study provides the first evidence of a role for zinc in depression in people with dementia and highlights zinc metabolism as a therapeutic target.\nclaim: The results of this study show that long-term intake of zinc may modulate symptoms of depression.\nclaim: The reported results indicated that the serum zinc level might be a marker of depression as a state (state marker) in treatment responsive patients.\nquestion: can zinc help treat depression?"}], "ideal": "Summary: All of these studies suggest that low zinc levels are a marker of depression and that intake of zinc may have the ability to help reduce symptoms of depression"}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The findings suggest that the following characteristics of the founder significantly influence the success potential of an incubated venture: entrepreneurial personality, motivation for starting the venture, managerial skills, and approach towards innovation.\nclaim: Using a sample of 384 entrepreneurs selected from the two leading business districts in Uganda, we observe that optimism is the component of psychological capital that significantly moderates the relationship between startup capital and entrepreneurial success.\nclaim: Both startup capital and psychological capital are significant predictors of entrepreneurial success; however, psychological capital is the better predictor.\nclaim: Entrepreneurially self\u2010efficacious founder/managers may help improve the performance of very young firms but such benefits dissipate over time.\nclaim: This finding indicates that the entrepreneurial team\u2019s startup experience plays stronger roles in venturing profitable startups when the amount of financial resources and initial firm size are small; however, the team\u2019s startup experience and intangible resources have positive interaction effects on new-born startups\u2019 profitability.\nquestion: what predicts success as a startup founder?"}], "ideal": "Summary: Things like entrepreneurial personality, motivation for starting the venture, managerial skills, previous start-up experience, startup and psychological capital and optimism all predict success as a startup founder"}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: While homelessness is ultimately the result of a severe and chronic shortage of affordable housing, creating accessible, safe, pet-friendly shelter and safe haven options and instituting a smoother, more transparent process for moving from the streets could substantially reduce street homelessness.\nclaim: - To prevent the revolving door to homelessness, it is necessary to remove the barriers that hinder access to normal health resources which are experienced by people suffering from social exclusion, while implementing ongoing support programmes for homeless people or those at risk of homelessness, which primarily deal with health issues.\nclaim: We conclude that overcoming homelessness requires policies and practices that give a greater focus to non-material aspects of homelessness through an emphasis on empowerment, self-respect and autonomy.\nclaim: This finding suggests that homelessness can be reduced by appropriate clinical interventions if housing is available.\nclaim: For homelessness prevention, systematic and outreach social medical care before and during homelessness should be provided.\nquestion: What are effective ways to prevent homelessness?"}], "ideal": "Summary: Ways to prevent homelessness include creating accessible, safe shelter and safe haven options, removing barriers to health resources, giving a greater focus to non-material aspects of homelessness, and providing systematic and outreach social medical care."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: While homelessness is ultimately the result of a severe and chronic shortage of affordable housing, creating accessible, safe, pet-friendly shelter and safe haven options and instituting a smoother, more transparent process for moving from the streets could substantially reduce street homelessness.\nclaim: - To prevent the revolving door to homelessness, it is necessary to remove the barriers that hinder access to normal health resources which are experienced by people suffering from social exclusion, while implementing ongoing support programmes for homeless people or those at risk of homelessness, which primarily deal with health issues.\nclaim: We conclude that overcoming homelessness requires policies and practices that give a greater focus to non-material aspects of homelessness through an emphasis on empowerment, self-respect and autonomy.\nclaim: This finding suggests that homelessness can be reduced by appropriate clinical interventions if housing is available.\nclaim: For homelessness prevention, systematic and outreach social medical care before and during homelessness should be provided.\nquestion: How to prevent homelessness?"}], "ideal": "Summary: Ways to prevent homelessness include creating accessible, safe shelter and safe haven options, removing barriers to health resources, giving a greater focus to non-material aspects of homelessness, and providing systematic and outreach social medical care."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The findings revealed that the factor that contributes the most to entrepreneurship intention is Locus of control, followed by Need of Achievement and Subjective Norms.\nclaim: It was found that entrepreneurial skill, environmental factors and entrepreneurial orientation have a positive influence on entrepreneurial intention.\nclaim: The findings indicate that entrepreneurial motivation has a significant correlation with entrepreneurial intention and its three determinants, social valuation of entrepreneurship, having entrepreneurial role models, knowledge of entrepreneurial support and perceived barriers to starting a business.\nclaim: Research finding revealed that entrepreneurial intention is indirectly affected by entrepreneurship education, meaning that students\u2019 entrepreneurial motivation and attitude are two important mediating variables.\nclaim: Findings confirm the influence of individual and socio-cultural factors on entrepreneurial intention.\nquestion: What are the factors of entrepreneurship intention"}], "ideal": "Summary: Studies find that intrinsic factors, such as entrepreneurial skill and motivation, as well as extrinsic variables, such as the environmental support of entrepreneurship, mediate entrepreneurship intention."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The results show that digital agriculture is able to help users to increase productivity in a sustainable way.\nclaim: Digital agriculture technologies continue the centralization of economic knowledge and power as they facilitate the transformation of vast territories into \u201coperational landscapes\u201d that provide the material, energy, and labor for a rapidly expanding urban system.\nclaim: The digital agriculture system is an effective tool for insurance industry to use to develop a dynamical business plan for the changing climate.\nclaim: The technical fitting-out of agriculture in the digital economy should be considered as a set of measures to prepare the industry for the production of high-quality products, which implies the use of digital technologies that minimize human participation in the production process.\nclaim: Consequently, the initial Mobile-based Information System evolved into a Digital Knowledge Ecosystem that can predict current production situation in near real enabling government agencies to dynamically adjust the incentives offered to farmers for growing different types of crops to achieve sustainable agriculture production through crop diversification.\nquestion: What is digital agriculture?"}], "ideal": "Summary: N-A"}
```
</details>
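The samples above share one shape: a system message, a user message made of `claim:` lines followed by a `question:` line, and an `ideal` string that begins with `Summary: `. The following is a minimal sketch of how such samples could be assembled and sanity-checked before committing the JSONL file. The helper names (`build_sample`, `sample_problems`, `check_jsonl`) are hypothetical and not part of the evals repository, and the `Summary: ` prefix check is an assumption inferred from the samples shown here rather than a documented requirement.

```python
import json

# System prompt copied from the samples in this PR.
SYSTEM_PROMPT = (
    "Generate a brief answer using only the provided claims, with no personal "
    "opinions or outside knowledge. If there is no answer based on the claims, "
    "write 'N-A'."
)


def build_sample(claims, question, ideal):
    """Assemble one sample in the shape used by the JSONL above (hypothetical helper)."""
    user = "\n".join(f"claim: {c}" for c in claims) + f"\nquestion: {question}"
    return {
        "input": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user},
        ],
        "ideal": f"Summary: {ideal}",
    }


def sample_problems(sample):
    """Return a list of human-readable problems; an empty list means the sample looks fine."""
    problems = []
    messages = sample.get("input")
    if not isinstance(messages, list) or not messages:
        problems.append("'input' must be a non-empty list of messages")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or set(msg) != {"role", "content"}:
                problems.append(f"message {i} must have exactly 'role' and 'content'")
    ideal = sample.get("ideal")
    # Assumption: every ideal answer in this eval starts with 'Summary: '.
    if not isinstance(ideal, str) or not ideal.startswith("Summary: "):
        problems.append("'ideal' must be a string starting with 'Summary: '")
    return problems


def check_jsonl(text, min_samples=15):
    """Check JSONL text line by line; returns (line_number, problem) pairs.

    Line number 0 is used for file-level problems such as too few samples.
    """
    report = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for n, line in enumerate(lines, start=1):
        try:
            sample = json.loads(line)
        except json.JSONDecodeError as exc:
            report.append((n, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(sample, dict):
            report.append((n, "each line must be a JSON object"))
            continue
        report.extend((n, p) for p in sample_problems(sample))
    if len(lines) < min_samples:
        report.append((0, f"only {len(lines)} samples, expected at least {min_samples}"))
    return report
```

The default `min_samples=15` mirrors the minimum sample count stated in the PR template; pass a smaller value when checking an excerpt such as the seven samples shown here.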
jacobbieker
pushed a commit
to withmartian/-ARCHIVED--router-evals
that referenced
this pull request
Jan 9, 2024
Linmj-Judy
pushed a commit
to TablewareBox/evals
that referenced
this pull request
Feb 27, 2024
Thank you for contributing an eval!♥️
🚨 Please make sure your PR follows these guidelines; failure to follow them will result in the PR being closed automatically. Note that meeting the criteria does not guarantee that the PR will be merged or that GPT-4 access will be granted. 🚨
PLEASE READ THIS:
In order for a PR to be merged, it must fail on GPT-4. We are aware that right now, users do not have access, so you will not be able to tell if the eval fails or not. Please run your eval with GPT-3.5-Turbo, but keep in mind as we run the eval, if GPT-4 gets higher than 90% on the eval, we will likely reject since GPT-4 is already capable of completing the task.
We plan to roll out a way for users submitting evals to see the eval performance on GPT-4 soon. Stay tuned! Until then, you will not be able to see the eval performance on GPT-4. Starting April 10, the minimum eval count is 15 samples; we hope this makes it easier to create and contribute evals.
Also, please note that we're using Git LFS for storing the JSON files, so please make sure that you move the JSON file to Git LFS before submitting a PR. Details on how to use Git LFS are available at https://git-lfs.com.
Eval details 📑
Eval name
Consensus Summary
Eval description
Utilize the model's ability to produce a Scientific Consensus in response to a scientific inquiry using the provided claims.
What makes this a useful eval?
This is a useful eval because it evaluates the model's ability to produce a scientific consensus in response to a given set of claims. This is important because scientific consensus is the result of multiple studies and data that may or may not support the same conclusion. A model that can accurately produce scientific consensus can help in making informed decisions and policies based on scientific evidence. Hence, evaluating a model's ability to produce a scientific consensus using the Consensus Summary eval can be useful in assessing its reliability and potential for practical applications.
Criteria for a good eval ✅
Below are some of the criteria we look for in a good eval. In general, we are seeking cases where the model does not do a good job despite being capable of generating a good response (note that there are some things large language models cannot do, so those would not make good evals).
Your eval should be:
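- [x] Thematically consistent: The eval should be thematically consistent. We'd like to see a number of prompts all demonstrating some particular failure mode. For example, we can create an eval on cases where the model fails to reason about the physical world.
- [x] Contains failures where a human can do the task, but either GPT-4 or GPT-3.5-Turbo could not.
- [x] Includes good signal around what is the right behavior. This means either a correct answer for `Basic` evals or the `Fact` Model-graded eval, or an exhaustive rubric for evaluating answers for the `Criteria` Model-graded eval.
- [x] Include at least 15 high quality examples.
If there is anything else that makes your eval worth including, please document it below.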
Unique eval value
Eval structure 🏗️
Your eval should
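- [x] Check that your data is in `evals/registry/data/{name}`
- [x] Check that your yaml is registered at `evals/registry/evals/{name}.yaml`
- [x] Ensure you have the right to use the data you submit via this eval
(For now, we will only be approving evals that use one of the existing eval classes. You may still write custom eval classes for your own cases, and we may consider merging them in the future.)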
Final checklist 👀
Submission agreement
By contributing to Evals, you are agreeing to make your evaluation logic and data under the same MIT license as this repository. You must have adequate rights to upload any data used in an Eval. OpenAI reserves the right to use this data in future service improvements to our product. Contributions to OpenAI Evals will be subject to our usual Usage Policies (https://platform.openai.com/docs/usage-policies).
Email address validation
If your submission is accepted, we will be granting GPT-4 access to a limited number of contributors. Access will be given to the email address associated with the merged pull request.
Limited availability acknowledgement
We know that you might be excited to contribute to OpenAI's mission, help improve our models, and gain access to GPT-4. However, due to the requirements mentioned above and high volume of submissions, we will not be able to accept all submissions and thus not grant everyone who opens a PR GPT-4 access. We know this is disappointing, but we hope to set the right expectation before you open this PR.
Submit eval
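- [x] I have filled out all required fields of this form
- [x] I have used Git LFS for the Eval JSON data
- [x] (Ignore if not submitting code) I have run `pip install pre-commit; pre-commit install` and have verified that `black`, `isort`, and `autoflake` are running when I commit and push
Failure to fill out all required fields will result in the PR being closed.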
Eval JSON data
Since we are using Git LFS, we are asking eval submitters to add in as many Eval Samples (at least 5) from their contribution here:
View evals in JSON
Eval
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: Two doses of mRNA covid-19 vaccines were observed to be highly effective against symptomatic infection and severe outcomes.\nclaim: COVID-19 vaccines currently authorized in the United States are highly effective in preventing COVID-19-associated hospitalizations in older adults.\nclaim: In summary, vaccines are a powerful tool that can be used to control the COVID-19 pandemic, with high efficacy and tolerable ADRs.\nclaim: Conclusion Overall, we conclude that vaccination against COVID-19 in patients with active malignancies using activated and inactivated vaccines is a safe and tolerable procedure that is also accompanied by a high efficacy.\nclaim: COVID-19 vaccines provide good protection against COVID-19 presentation at primary care/outpatient level, particularly among fully vaccinated individuals.\nquestion: are covid-19 vaccines effective?"}], "ideal": "Summary: Covid-19 vaccines are highly effective at protecting against infection and hospitalization."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: Lower zinc is a hallmark of depression, while increments in serum zinc and attenuation of the immune-inflammatory response during treatment appear to play a role in the clinical efficacy of sertraline.\nclaim: An increase in dietary zinc and higher plasma zinc levels may reduce the risk of depressive symptoms.\nclaim: Although decreased zinc levels have been implicated in the genesis of depression in animal models and in major depressive disorder in humans, this study provides the first evidence of a role for zinc in depression in people with dementia and highlights zinc metabolism as a therapeutic target.\nclaim: The results of this study show that long-term intake of zinc may modulate symptoms of depression.\nclaim: The reported results indicated that the serum zinc level might be a marker of depression as a state (state marker) in treatment responsive patients.\nquestion: can zinc help treat depression?"}], "ideal": "Summary: All of these studies suggest that low zinc levels are a marker of depression and that intake of zinc may have the ability to help reduce symptoms of depression"}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The findings suggest that the following characteristics of the founder significantly influence the success potential of an incubated venture: entrepreneurial personality, motivation for starting the venture, managerial skills, and approach towards innovation.\nclaim: Using a sample of 384 entrepreneurs selected from the two leading business districts in Uganda, we observe that optimism is the component of psychological capital that significantly moderates the relationship between startup capital and entrepreneurial success.\nclaim: Both startup capital and psychological capital are significant predictors of entrepreneurial success; however, psychological capital is the better predictor.\nclaim: Entrepreneurially self\u2010efficacious founder/managers may help improve the performance of very young firms but such benefits dissipate over time.\nclaim: This finding indicates that the entrepreneurial team\u2019s startup experience plays stronger roles in venturing profitable startups when the amount of financial resources and initial firm size are small; however, the team\u2019s startup experience and intangible resources have positive interaction effects on new-born startups\u2019 profitability.\nquestion: what predicts success as a startup founder?"}], "ideal": "Summary: Things like entrepreneurial personality, motivation for starting the venture, managerial skills, previous start-up experience, startup and psychological capital and optimism all predict success as a startup founder"}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: While homelessness is ultimately the result of a severe and chronic shortage of affordable housing, creating accessible, safe, pet-friendly shelter and safe haven options and instituting a smoother, more transparent process for moving from the streets could substantially reduce street homelessness.\nclaim: - To prevent the revolving door to homelessness, it is necessary to remove the barriers that hinder access to normal health resources which are experienced by people suffering from social exclusion, while implementing ongoing support programmes for homeless people or those at risk of homelessness, which primarily deal with health issues.\nclaim: We conclude that overcoming homelessness requires policies and practices that give a greater focus to non-material aspects of homelessness through an emphasis on empowerment, self-respect and autonomy.\nclaim: This finding suggests that homelessness can be reduced by appropriate clinical interventions if housing is available.\nclaim: For homelessness prevention, systematic and outreach social medical care before and during homelessness should be provided.\nquestion: What are effective ways to prevent homelessness?"}], "ideal": "Summary: Ways to prevent homelessness include creating accessible, safe shelter and safe haven options, removing barriers to health resources, giving a greater focus to non-material aspects of homelessness, and providing systematic and outreach social medical care."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: While homelessness is ultimately the result of a severe and chronic shortage of affordable housing, creating accessible, safe, pet-friendly shelter and safe haven options and instituting a smoother, more transparent process for moving from the streets could substantially reduce street homelessness.\nclaim: - To prevent the revolving door to homelessness, it is necessary to remove the barriers that hinder access to normal health resources which are experienced by people suffering from social exclusion, while implementing ongoing support programmes for homeless people or those at risk of homelessness, which primarily deal with health issues.\nclaim: We conclude that overcoming homelessness requires policies and practices that give a greater focus to non-material aspects of homelessness through an emphasis on empowerment, self-respect and autonomy.\nclaim: This finding suggests that homelessness can be reduced by appropriate clinical interventions if housing is available.\nclaim: For homelessness prevention, systematic and outreach social medical care before and during homelessness should be provided.\nquestion: How to prevent homelessness?"}], "ideal": "Summary: Ways to prevent homelessness include creating accessible, safe shelter and safe haven options, removing barriers to health resources, giving a greater focus to non-material aspects of homelessness, and providing systematic and outreach social medical care."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The findings revealed that the factor that contributes the most to entrepreneurship intention is Locus of control, followed by Need of Achievement and Subjective Norms.\nclaim: It was found that entrepreneurial skill, environmental factors and entrepreneurial orientation have a positive influence on entrepreneurial intention.\nclaim: The findings indicate that entrepreneurial motivation has a significant correlation with entrepreneurial intention and its three determinants, social valuation of entrepreneurship, having entrepreneurial role models, knowledge of entrepreneurial support and perceived barriers to starting a business.\nclaim: Research finding revealed that entrepreneurial intention is indirectly affected by entrepreneurship education, meaning that students\u2019 entrepreneurial motivation and attitude are two important mediating variables.\nclaim: Findings confirm the influence of individual and socio-cultural factors on entrepreneurial intention.\nquestion: What are the factors of entrepreneurship intention"}], "ideal": "Summary: Studies find that intrinsic factors, such as entrepreneurial skill and motivation, as well as extrinsic variables, such as the environmental support of entrepreneurship, mediate entrepreneurship intention."}
{"input": [{"role": "system", "content": "Generate a brief answer using only the provided claims, with no personal opinions or outside knowledge. If there is no answer based on the claims, write 'N-A'."}, {"role": "user", "content": "claim: The results show that digital agriculture is able to help users to increase productivity in a sustainable way.\nclaim: Digital agriculture technologies continue the centralization of economic knowledge and power as they facilitate the transformation of vast territories into \u201coperational landscapes\u201d that provide the material, energy, and labor for a rapidly expanding urban system.\nclaim: The digital agriculture system is an effective tool for insurance industry to use to develop a dynamical business plan for the changing climate.\nclaim: The technical fitting-out of agriculture in the digital economy should be considered as a set of measures to prepare the industry for the production of high-quality products, which implies the use of digital technologies that minimize human participation in the production process.\nclaim: Consequently, the initial Mobile-based Information System evolved into a Digital Knowledge Ecosystem that can predict current production situation in near real enabling government agencies to dynamically adjust the incentives offered to farmers for growing different types of crops to achieve sustainable agriculture production through crop diversification.\nquestion: What is digital agriculture?"}], "ideal": "Summary: N-A"}
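The samples above all share one layout: a system message with the fixed instruction, a user message made of `claim: ` lines followed by a final `question: ` line, and an `ideal` string that starts with `Summary: `. As a rough sketch of how such samples could be assembled and sanity-checked before being written out as JSONL, the following Python may help. The helper names `build_sample`, `check_sample`, and `to_jsonl` are hypothetical and are not part of the evals repository; the checks only mirror the pattern visible in the samples here.

```python
import json

# System prompt copied from the samples in this PR.
SYSTEM_PROMPT = (
    "Generate a brief answer using only the provided claims, with no personal "
    "opinions or outside knowledge. If there is no answer based on the claims, "
    "write 'N-A'."
)


def build_sample(claims, question, ideal):
    """Assemble one sample dict in the layout used by the samples above (hypothetical helper)."""
    user_content = "".join(f"claim: {c}\n" for c in claims) + f"question: {question}"
    return {
        "input": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
        "ideal": f"Summary: {ideal}",
    }


def check_sample(sample):
    """Return a list of problems found in one sample; an empty list means none (hypothetical helper)."""
    messages = sample.get("input")
    if not isinstance(messages, list) or len(messages) != 2:
        return ["'input' must be a list of two messages"]
    problems = []
    roles = [m.get("role") for m in messages]
    if roles != ["system", "user"]:
        problems.append(f"unexpected roles: {roles}")
    user = messages[1].get("content", "")
    if "claim: " not in user:
        problems.append("no 'claim: ' line in user message")
    # The last line of the user message should carry the question.
    if not user.rsplit("\n", 1)[-1].startswith("question: "):
        problems.append("user message must end with a 'question: ' line")
    if not str(sample.get("ideal", "")).startswith("Summary: "):
        problems.append("'ideal' must start with 'Summary: '")
    return problems


def to_jsonl(samples):
    """Serialize samples as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(s) for s in samples)
```

For example, `build_sample(claims, "What is digital agriculture?", "N-A")` would produce a sample whose `ideal` is `Summary: N-A`, matching the no-consensus case in the last sample above.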