Abstract- A ‘hero’ in a software project is a highly skilled developer who works almost single-handedly to make the project a success. More precisely, a software project is said to have ‘hero’ developers if more than 80% of contributions are made by less than 20% of the developers. The literature deplores such projects, arguing that heroes can slow a project down by becoming a bottleneck for development and communication. However, there is little to no evidence for this claim.
Recent studies show that hero projects are very common and significant, particularly among medium to large development projects. This report explores the effect of having heroes in a project. It was drafted after studying several research papers and is based on data and statistics collected from open-source projects.
Keywords- Heroes, Software, Development, Projects.
A project is usually initiated by a project leader, who remains a contributor to the project for the longest duration and is known in the literature as the hero or core developer of the project.
The literature [4, 3, 6] typically defines a project as a ‘hero’ project if more than 80% of the work is done by less than 20% of the developers, and it disapproves [8, 9, 5] of such projects because heroes are believed to act as a bottleneck and hamper the speed of project development. However, the literature presents no thorough, evidence-based study to back this belief. A statistical analysis of data collected from open-source projects led to the following findings:
Hero Projects are very common in public as well as enterprise projects.
The number of heroes in a project does not affect the process of project development. The following exceptions hold:
- A project of large size needs a hero.
- The rate of enhancement increases with the number of heroes.
The results are surprising because, according to the literature, a hero is expected to have a negative impact on software development projects, where the work is meant to be divided among a large group of developers. The subsequent sections of this report show how we reached these conclusions and are built around the following research questions:
RQ1. How prevalent are hero projects?
About 77% of projects are hero projects. The number includes Public and Enterprise projects.
RQ2. How does team size affect the prevalence of these projects?
As team size increases, the need for a hero also rises.
RQ3. Do hero projects promise better software quality?
This is determined using 6 measures: the number of issues, bugs, and enhancements being resolved, and their respective response times. No statistical difference was found between hero and non-hero projects in the percentage of issues and bugs resolved or in their response times. However, hero projects closed statistically more enhancement issues in enterprise projects and statistically fewer in public projects. There was no difference in the response time for enhancements, though.
Thus, heroes are far more valuable assets than the literature suggests. This document is structured as follows: the section after the introduction contains the literature review, followed by the experimental details. The questions posed above are answered in detail in Section 4. Section 5 discusses the results. Sections 6 and 7 present the threats to validity and the conclusion of the report, respectively.
A large project, such as an enterprise or public project, offers a number of developer roles, including project leader, core developer, peripheral developer, active developer, bug reporter, bug fixer, and passive user. Among them, a core developer can make a good project leader. A core developer contributes almost 80% of the code. Core developers implement most of the code changes and make the important decisions. The contribution of a core developer can be established by counting the number of lines of code they change or the number of commits they push. A study of the work of Pinto et al. found that only 1.73% of the total number of commits were made by 48% of peripheral developers. About 28.6% of these commits are trivial and 30.2% fix bugs. Only 18.7% contribute to new features. This uneven division motivates this research.
2.2 LITERATURE REVIEW
Heroism in software development has been a widely studied topic. Many researchers have pointed out the presence of heroes in software development projects. For example:
Peterson analyzed the software development process on GitHub and found that development is mostly done by a small group of developers.
In 2002, Koch et al. studied the GNOME project and showed the presence of heroes throughout the project history.
In 2005, Krishnamurthy studied about 100 open-source projects and found that, in most cases, a few individuals were responsible for the success of the project.
Researchers have also commented on the bottleneck problem arising due to heroes and the lack of collaboration and understanding in the team.
Distributed coding efforts can give rise to agile community-based programming practices which can lead to:
- A decrease in the response time for resolving issues/bugs/enhancements.
- A surge in the number of issues/bugs/enhancements being resolved.
In conclusion, most of the literature, apart from a few research papers, has established as a truism that heroes are harmful to a project, without any empirical evidence and without investigating their value in public or enterprise projects. This report considers the different aspects of having a hero and states conclusions based on empirical evidence.
- Collaboration: A project must have at least one pull request, which gives an indication of how many peripheral developers there are.
- Commits: The project should have a minimum of 20 commits.
- Duration: The project must have been active for 50 weeks.
- Issues: There should be more than 10 issues.
- Personal Purpose: The project should be collaborative and have a minimum of 8 contributors.
- Releases: At least one release.
- Software Development: The project should serve only as a placeholder for software development source code.
The projects hosted on GitHub were tested for the above-mentioned criteria and the projects left after discarding the non-satisfying projects were used for the experiment. The process was repeated for the public as well as enterprise data sources.
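The selection criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Project` record and its field names are hypothetical, "active for 50 weeks" is read as "at least 50 weeks", and the software-development criterion is represented by a boolean flag that would have to be set by some separate inspection step.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical record; field names are assumptions, not from the source.
    name: str
    pull_requests: int
    commits: int
    active_weeks: int
    issues: int
    contributors: int
    releases: int
    is_software_project: bool  # assumed to be determined elsewhere


def passes_sanity_checks(p: Project) -> bool:
    """Return True if the project satisfies every listed criterion."""
    return (
        p.pull_requests >= 1        # Collaboration: at least one pull request
        and p.commits >= 20         # Commits: minimum of 20
        and p.active_weeks >= 50    # Duration: read here as "at least 50 weeks"
        and p.issues > 10           # Issues: more than 10
        and p.contributors >= 8     # Personal Purpose: minimum of 8 contributors
        and p.releases >= 1         # Releases: at least one
        and p.is_software_project   # Software Development criterion
    )


def select_projects(projects):
    """Keep only the projects that pass all checks."""
    return [p for p in projects if passes_sanity_checks(p)]
```

Expressing each criterion as one condition keeps the thresholds easy to audit against the list.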
3.2 Metric Extraction
To answer the research questions posed earlier, we extract from our research database the number of commits made by each individual developer in a project. If more than 80% of the contributions are made by less than 20% of the developers, the project is added to the list of ‘hero’ projects.
Note that GitHub allows merging pull requests from external developers, and once merged, these contributors are added to the contributor list as well. Such merges over-inflate the “hero effect”; hence, we have not included such pull requests.
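A minimal sketch of the hero test described above, assuming commit counts per developer are already available as a mapping. The function name is hypothetical, and the strict reading of the rule (strictly fewer than 20% of developers, strictly more than 80% of commits) is an assumption; the source does not say how ties at the boundaries are handled.

```python
from typing import Mapping


def is_hero_project(commits_per_dev: Mapping[str, int]) -> bool:
    """Check whether fewer than 20% of developers made more than 80% of commits.

    Assumption: both inequalities are strict.
    """
    counts = sorted(commits_per_dev.values(), reverse=True)
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        return False
    # Largest k such that k / n < 0.2, computed with integer arithmetic.
    k = (n - 1) // 5
    top_share = sum(counts[:k])
    # top_share / total > 0.8, written without floating point.
    return top_share * 5 > total * 4
```

For example, with ten developers where one has 900 commits and the other nine have 10 each, the single top developer (10% of the team) holds 900 of 990 commits, so the project counts as a hero project under this reading.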
Next, we classify these projects in our database based on the sizes of developer teams, into small, medium, and large teams.
Small teams: 8 < #developers < 15
Medium teams: 15 < #developers < 30
Large teams: #developers > 30
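The team-size buckets can be sketched as below. The thresholds as written leave teams of exactly 8, 15, and 30 developers unassigned; this sketch assumes 8 belongs to small teams (matching the minimum of 8 contributors in the selection criteria), 15 to medium, and 30 to large. That boundary handling is an assumption, not something the source states.

```python
def team_size_category(num_developers: int) -> str:
    """Map a developer count to 'small', 'medium', or 'large'.

    Assumption: lower bounds are inclusive (8 -> small, 15 -> medium, 30 -> large).
    """
    if num_developers < 8:
        raise ValueError("projects with fewer than 8 developers are filtered out")
    if num_developers < 15:
        return "small"
    if num_developers < 30:
        return "medium"
    return "large"
```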
Next, we define 6 metrics:
$I_r / I_t = \dfrac{\text{total \#issues closed}}{\text{total \#issues created}}$
$B_r / B_t = \dfrac{\text{total \#bug-tagged issues closed}}{\text{total \#bug-tagged issues created}}$
$E_r / E_t = \dfrac{\text{total \#enhancement-tagged issues closed}}{\text{total \#enhancement-tagged issues created}}$
$IR_t$ = Median time taken to resolve issues
$BR_t$ = Median time taken to resolve bug issues
$ER_t$ = Median time taken to resolve enhancement issues
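The six metrics can be computed from a list of issue records roughly as follows. This is a sketch under assumptions: the `Issue` record is hypothetical, the label names `bug` and `enhancement` are assumed, resolution time is measured in days from creation to closing, and a metric is reported as `None` when there is nothing to compute it from.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, List, Optional, Set

ONE_DAY = timedelta(days=1)


@dataclass
class Issue:
    # Hypothetical record; field names are assumptions.
    created: datetime
    closed: Optional[datetime] = None  # None means the issue is still open
    labels: Set[str] = field(default_factory=set)


def closed_ratio(issues: List[Issue]) -> Optional[float]:
    """Closed issues divided by created issues, or None if there are no issues."""
    if not issues:
        return None
    return sum(1 for i in issues if i.closed is not None) / len(issues)


def median_days_to_resolve(issues: List[Issue]) -> Optional[float]:
    """Median days from creation to closing over the closed issues."""
    days = [(i.closed - i.created) / ONE_DAY for i in issues if i.closed is not None]
    return median(days) if days else None


def quality_metrics(issues: List[Issue]) -> Dict[str, Optional[float]]:
    bugs = [i for i in issues if "bug" in i.labels]                  # assumed label
    enhancements = [i for i in issues if "enhancement" in i.labels]  # assumed label
    return {
        "issues_closed_ratio": closed_ratio(issues),
        "bugs_closed_ratio": closed_ratio(bugs),
        "enhancements_closed_ratio": closed_ratio(enhancements),
        "IRt": median_days_to_resolve(issues),
        "BRt": median_days_to_resolve(bugs),
        "ERt": median_days_to_resolve(enhancements),
    }
```

Using the median rather than the mean keeps a handful of very old issues from dominating the response-time metrics.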
3.3 Statistical Tests
We use two tests to compare results between ‘hero’ and ‘non-hero’ projects: a significance test, which detects whether two populations differ by more than random noise, and an effect size test, which checks whether two populations differ by more than a trivial amount. The Scott-Knott method is used for the significance test. This technique recursively arranges the numbers into bi-clusters. If any two clusters are not statistically different, Scott-Knott reports both as a single group.
Note: The data and figures pertain to the research data and analysis presented in one of the research papers.
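To make the idea of an effect size test concrete, here is a small sketch using Cliff's delta. The source does not name the effect size measure that was used, so Cliff's delta is swapped in purely for illustration. The 0.147 cut-off for a "negligible" effect is a commonly cited convention and is an assumption here, not a value taken from the source.

```python
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: P(x > y) - P(x < y) over all pairs, ranging from -1 to 1."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def is_non_trivial(delta: float, threshold: float = 0.147) -> bool:
    """Treat |delta| below the threshold as a trivial difference (assumed cut-off)."""
    return abs(delta) >= threshold
```

A delta near 0 means the two populations overlap heavily, while a value near 1 or -1 means one population almost always exceeds the other.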
4.1 How Prevalent Are Heroes?
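Keep in mind that we define a project as a hero project when more than 80% of the “contribution” comes from less than 20% of the developers. To evaluate the spread of such projects, we classified every project in the data set as hero or non-hero under this definition.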
4.2 How Does Team Size Affect Its Prevalence?
Figures 2 and 3 show the distribution of hero and non-hero projects across different team sizes in public and enterprise projects. A project’s dependency on heroes grows with its size. We note that the benefits of having heroes, where a small group handles the complex interactions seen in large-scale projects, outweigh the theoretical drawbacks of heroes.
Figure 2: Public Projects- Hero and Non-hero projects for different team sizes
Figure 3: Enterprise Projects- Hero and Non-hero projects for different team sizes
Figures 6 and 7 show that whether a project is a hero or non-hero project has no effect on the time required to close issues, bugs, and enhancements.
There are certain threats to the validity of the experiment presented. However, all experiments rest on a base of assumptions and exceptions. Hence, it is not unreasonable to present the following conclusions from our analysis:
The conventional wisdom in the literature is to write off “heroes”, i.e., the small percentage of staff responsible for most of the progress of a project. After statistically analyzing a large number of enterprise and public GitHub projects, we assert that it is time to revisit that wisdom:
- Most projects are hero projects, especially when we consider medium to large projects. That is, debating the option of avoiding heroes really only matters for smaller projects.
- Heroes do not significantly affect the rate of closing of issues and bugs.
- They also do not affect the time needed to address issues, bugs, or enhancements.
- Heroes have a positive impact on the rate at which enhancement issues are dealt with in enterprise projects.
The only place where our conclusions agree with established wisdom is the rate of enhancement resolution, where non-hero public projects do better. That said, such benefits for non-hero projects are rare. We extracted the above features and classified these projects as heroic and non-heroic.
From Figure 1, we can observe that around 80% of public and enterprise projects are led by hero or core developers. One explanation for so many heroes is that our results may be erroneous and merely the result of the “building effect”. But given our filtering criteria and large database, we can claim that this is not the case. The most plausible explanation for so many heroes is the nature of software development. Take, for example, GitHub’s open-source projects, which are often initiated by a project leader who is responsible for maintaining and moderating the project. Until the project becomes popular, only the leader is responsible for the contributions to the main code. Once the project has stabilized and gained popularity, the fixes for ongoing issues, bugs, and enhancements are often just a few lines contributed by peripheral developers. Such a process naturally leads to heroes. Whatever the reason, the pattern is very clear: hero projects are very common.
Figure 1: Hero projects are very common
4.3 Do Hero Projects Promise Better Software Quality?
We shall address this question in two parts, in terms of efficiency and accuracy.
4.3.1 Do hero projects resolve more bug and enhancement issues?
As can be seen in Figures 4 and 5, the following results hold:
In public projects, hero projects close the fewest enhancement issues.
In enterprise projects, hero projects close the most enhancement issues.
Heroes in enterprise development lead to greater control over the project.
In short, our experience calls for a review of long-held software engineering beliefs. Software heroes are far more common and valuable than literature has indicated, especially for the development of medium to large projects. Organizations should think of better ways to find and keep more of these heroes.
In the figures, the X-axis separates hero and non-hero projects. Each label on the X-axis is further annotated with a rank, which is the result of a statistical comparison of the two populations using the Scott-Knott test. Two populations with the same rank are statistically indistinguishable.
Note: Figures 4, 5, 6, and 7 are given in the appendix (pg-7) of the document.
4.3.2 Does Having a Hero Improve the Response Time of Resolving an Issue?
Figures 6 and 7 show boxplots comparing the time taken to resolve issues in hero and non-hero projects.
Amritanshu Agrawal, Akond Rahman, Rahul Krishna, Alexander Sobran*, and Tim Menzies (Computer Science, NCSU, USA; *IBM Corp, Research Triangle, North Carolina). We Don’t Need Another Hero?
Yunwen Ye and Kouichi Kishida. 2003. Toward an understanding of the motivation of Open Source Software developers. In Proceedings of the 25th International Conference on Software Engineering. IEEE Computer Society, 419–429.
MR Martinez Torres, SL Toral, M Perales, and F Barrero. 2011. Analysis of the core team role in open source communities. In Complex, Intelligent and Software Intensive Systems (CISIS), 2011 International Conference on. IEEE, 109–114.
Jason Tsay, Laura Dabbish, and James Herbsleb. 2014. Influence of social and technical factors for evaluating contribution in GitHub. In Proceedings of the 36th International Conference on Software Engineering. ACM, 356–366.
Trevor Wood-Harper and Bob Wood. 2005. Multiview as social informatics in action: past, present and future. Information Technology & People 18, 1 (2005), 26–32.
Kazuhiro Yamashita, Shane McIntosh, Yasutaka Kamei, Ahmed E Hassan, and Naoyasu Ubayashi. 2015. Revisiting the applicability of the Pareto principle to core development teams in open source software projects. In Proceedings of the 14th International Workshop on Principles of Software Evolution. ACM, 46–55.
Gustavo Pinto, Igor Steinmacher, and Marco Aurélio Gerosa. 2016. More common than you think: An in-depth study of casual contributors. In Software Analysis, Evolution, and Reengineering (SANER), 2016 IEEE 23rd International Conference on, Vol. 1. IEEE, 112–123.
Barry Boehm. 2006. A view of 20th and 21st century software engineering. In Proceedings of the 28th International Conference on Software Engineering. ACM, 12–29.
Jordi Cabot, Javier Luis Cánovas Izquierdo, Valerio Cosentino, and Belén Rolandi. 2015. Exploring the use of labels to categorize issues in open-source software projects. In Software Analysis, Evolution and Reengineering (SANER).
Suvodeep Majumder, IEEE Member, Joymallya Chakraborty, Amritanshu Agrawal, Tim Menzies, IEEE Fellow. Why Software Projects need Heroes (Lessons Learned from 1100+ Projects).
Kevin Peterson. The GitHub open source development process. 12 2013.
Stefan Koch and Georg Schneider. Effort, co-operation and co-ordination in an open source software project: GNOME. Information Systems Journal.
Sandeep Krishnamurthy. Cave or community? An empirical examination of 100 mature open source projects (originally published in volume 7, number 6, June 2002).
Heroes in Software Projects. (2020, Sep 03). Retrieved from https://studymoose.com/heroes-in-software-projects-essay