Evaluating data quality collected by volunteers for first-level inspection of hydraulic structures in mountain catchments

Volunteers have been trained to perform first-level inspections of hydraulic structures within campaigns promoted by the civil protection of Friuli Venezia Giulia (Italy). Two inspection forms and a learning session were prepared to standardize data collection on the functional status of bridges and check dams. In all, 11 technicians and 25 volunteers inspected up to six structures in Pontebba, a mountain community within the Fella Basin. Volunteers included civil-protection volunteers and students of the geosciences and social sciences. Some participants carried out the inspections without attending the learning session. We therefore used the mode of the technicians in the learning group as the reference to distinguish accuracy levels between volunteers and technicians. Data quality was assessed in terms of accuracy, precision and completeness. We assigned ordinal scores to the rating scales in order to obtain an indication of the structure status. We also considered the performance and feedback of participants to identify corrective actions in the survey procedures. Results showed that volunteers could perform comparably to technicians, but only within a given range in precision. However, a completeness ratio (question / parameter) was still needed whenever volunteers used unspecified options. Volunteers' ratings can thus be considered preliminary assessments that do not replace other procedures. Future research should consider the advantages of mobile applications for data-collection methods.


Introduction
There is increasing interest in the use of citizen-based approaches to better understand the environment and hazard-related processes. To that end, different data-collection approaches exist according to the citizens' skills and time of involvement, e.g., crowdsourcing (Hudson-Smith et al., 2008), volunteered geographic information (Goodchild, 2007) or facilitated volunteered geographic information (Seeger, 2008). Moreover, scientists have increasingly considered management approaches based upon the broader concept of citizen science (Bonney et al., 2009), whereby volunteers are enlisted and trained according to survey and management needs (Devictor et al., 2010).
In disaster risk management, citizen science is linked to European and worldwide directives, such as the Hyogo framework (European Commission, 2007; United Nations, 2005). Such directives promote citizen involvement to build a culture of resilience before, during and after a disaster strikes (European Commission, 2012). Therefore, modern approaches for emergency management promote the exchange of information between local authorities and volunteer groups to support preparedness and preventive actions (Enders, 2001).
Hydro-meteorological events in mountain areas often involve multiple, sudden-onset floods and debris flows. Traditionally, hazard mitigation in the European Alps is mainly organized by implementing structural measures. However, the increasing frequency and influence of flow and sediment processes also affect the functional status of hydraulic structures, and vice versa (Holub and Hübl, 2008).
The impact (i.e., damage) is evident to structures for debris-flow control, such as check dams. Evidence is also found in the potential aggravation of flood hazard at the location of bridges and culverts due to blocking material such as debris, large wood and other residues (Mazzorana et al., 2010). Stability of protection works is often threatened by the erosion level at stream banks.
The need to enhance data-collection approaches to support risk-management strategies is widely acknowledged (e.g., Molinari et al., 2014). Besides situations where financial and human resources are limited, scientific monitoring may be subject to additional complexity under dynamic environmental conditions or remote settings (de Jong, 2013).
Moreover, frequent inspection of hydraulic structures is especially important in mountain basins. Therefore, opportunities for promoting citizen science projects stem from the increasing frequency, timeliness and coverage of surveillance activities (Flanagin and Metzger, 2008). To be useful, survey procedures should be tested and adapted according to the quality requirements of decision-makers (Bordogna et al., 2014; EPA, 1997; Goodchild and Li, 2012; Gouveia and Fonseca, 2008).
Despite the challenges of citizen involvement, the precision and completeness of data collected at large depend on the exhaustiveness of the inspection procedures (Galloway et al., 2006). Therefore, training activities are often required before starting inspection campaigns. However, the extent of these training sessions should consider the available time and the number and type of participants (Tweddle et al., 2012). To that end, Jordan et al. (2011) suggested that the identification of technical data should be restricted, while more general indicators can still be accurately obtained. Those indicators may be quantitative and qualitative aspects that are easily recognizable from visual inspections (Gommerman and Monroe, 2012; Gouveia et al., 2004). Qualitative field methods are then generally based on rating scales to report inspected conditions.

This study considers regular inspections with citizen-volunteer groups that are promoted by the civil protection and local authorities of Friuli Venezia Giulia (FVG), Italy. We involved 11 technicians and 25 volunteers in a data-collection exercise. Participants were invited according to their location; thus, 15 out of 25 volunteers were members of civil-protection groups in neighboring municipalities. The remaining 10 volunteers were university students involved within a supplementary academic activity.
In this paper, we evaluate data quality in preliminary inspections of bridges and check dams. We address the following research questions: (1) how well were participants able to report on the functional status by distinguishing between available rating classes? (2) How effectively were data collected by volunteers compared to those collected by technicians? (3) How can survey procedures be improved? To that end, Sect. 2 describes the methodology for data collection. In Sect. 3, we evaluate data quality by its accuracy, precision and completeness. Finally, in the discussion and conclusions (Sects. 4 and 5) we highlight key points for the practical use of citizen-based data.

Methods
In the first step of the methodology, we defined the target groups. The participants' groups comprised volunteers and technicians in order to evaluate data quality. In the second step, two inspection forms were designed for bridges and check dams, created to carry out inspections with trained volunteers.
Although procedures for technicians are available, the volunteers' involvement demands more structured and simpler forms to inspect the functional status. Similar to Yetman (2002), we used rating scales to standardize collected data by distinguishing minor problems from more serious concerns. The rating scales included visual schemes to guide inspectors. In addition, inspectors were required to take a photo to support their choices.
Finally, we organized a data-collection exercise to carry out first-level inspection of six structures, hereafter referred to as "inspection tests". There was 1 day for the training session and 1 day for the inspection tests. Participants were divided between control and learning groups to identify potential improvements in survey procedures.

Participants' groups
Citizens were involved in the form of civil-protection volunteers due to safety limitations and restricted accessibility to hydraulic structures in mountain catchments. Citizens enrolled as civil-protection volunteers traditionally receive formative, informative and safety training, while specialized training is provided selectively (Protezione Civile FVG, 2009). In addition, we widened the range of participants with students to account for assumed differences in the preliminary knowledge needed to fill in the form. Volunteers thus included geology students and students from a master's course on cooperation, both from the University of Trieste. The technicians were employees of regional services with competences for the inspection of hydraulic structures in the mountain community.
Volunteers (Vs) and technicians (Ts) joined the activity according to their time availability (Table 1). The control group (CG) carried out the inspection tests without attending the learning session. Most citizen volunteers were placed in the learning group (LG), as they are the target group of the campaigns promoted by civil protection. Students of geosciences were only available for the inspection tests during the first day; their involvement was important to facilitate knowledge exchange during the outdoor learning. They were therefore equally divided into a volunteers' learning group (VLG) and a volunteers' control group (VCG).
During registration for the data-collection exercise, participants were asked to fill in a questionnaire to characterize the participants' groups (Table A1). Gathered information included demographics such as age, gender and level of education, as well as the period of residence in the Fella Basin and the FVG region. In addition, we measured experience with hydro-meteorological hazards (i.e., floods, debris flows and landslides). The questionnaire also included 20 questions to assess the prior knowledge of participants on debris-flow phenomena, the functionality of check dams and culverts, as well as emergency security guidelines.

Design of the inspection forms
Table 2 summarizes the forms' layouts divided by sections. We defined the sections with four risk managers of the civil protection, the geological survey and the forestry service of FVG. Section I identifies the inspector. Section II comprises simplified information on location, type, use and presence of connected structures, if available. Section III accounts for the level of accessibility, the presence of stream water, and the occurrence of rainfall and snow; it thus becomes relevant for comparing campaigns carried out at different time periods. Section IV of the form refers to the functional status of the inspected structure. Functional status is the susceptibility or physical condition of the structure that may affect the function for which it was designed or built (Uzielli et al., 2008). The functional status is inspected by looking at three parameters, according to the structure type.
Parameters in section IV comprise a maximum of four questions. For example, questions for check dams in parameter A are: (1) is the stream flow passing where it should be? (2) What is the status of the check dam? (3) How visible is the basis of the structure? (4) Is there any protection for scouring at the downstream bottom of the check dam? Questions, rating options and visual schemes were defined according to inspection procedures for technicians, mainly for check dams (Jakob and Hungr, 2005; Provincia Autonoma di Bolzano, 2006; von Maravic, 2010; Province of British Columbia, 2000) and bridges (Burke Engineering, 1999; Ohio Department of Transportation, 2010; Servizio Forestale FVG, 2002). The inspection forms adopted in this study are available as the Supplement.
For bridges, parameter A focuses on the opening for the water flow and on erosion of the pillar or abutments. Parameter B assesses levels of lateral obstruction, either at the structure location or in the stream channel. Questions for these parameters were therefore aimed at identifying morphological changes immediately upstream, downstream and at the structure location. Such changes relate to local erosion at protection structures and abutments, to deposition phenomena that are somewhat perennial, with presence of vegetation, or to the clogging of critical flow sections. Finally, parameter B accounts for additional elements, such as pipes, when they reduce the stream cross section.
We also included a question referring to the maximum free height of the structure. However, it was eventually not considered due to safety limitations for citizen volunteers when accessing the stream channel. Those limitations stem from the dynamic distribution of deposits and eroding surfaces along steep mountain channels (Remaître et al., 2005). Therefore, volunteers should follow safety procedures according to the environmental and meteorological conditions during the inspection period.
For check dams, the focus of parameter A is on the status of the structure itself and downstream scouring. Parameter B distinguishes between consolidation and open check dams. Then, upstream obstruction is limited to the open check dam type. That distinction is due to the relevance of open check dams for retention of sediments, if there is a retention basin connected to the structure. Therefore, we included a "Does not apply" option for inspecting consolidation check dams.
In contrast, parameter C addresses the same questions for bridges and check dams. It refers to the worst condition while looking at the presence of protection works and erosion level at the stream banks. Then, we established a control distance of 20 m upstream and downstream of the structure. This distance was defined to reduce variability of assessments during the inspection. The 20 m allow inspectors to observe and to take pictures, even if accessibility to the structure is restricted. Section V of the form reports the critical infrastructure within the same control distance. Finally, section VI distinguishes required actions to follow up the inspection based on the options provided in the form.
Data-quality evaluation focuses on sections IV and VI of the inspection form. Table 3 summarizes the rating scales we used. When the question itself did not specify the location to report, a multiple-choice item was included to specify the problem's location: right bank, left bank or at the structure. In addition, all questions had alternative options to report unspecified answers, such as "I don't know" and "Could not be answered". The latter represents conditions at the structure location (e.g., water level) that did not allow inspectors to provide an assessment.
In addition, we assigned ordinal scores to the rating classes to get an indication of the functional status. For the data-quality evaluation, we aggregated scores according to the given range in precision while generalizing the rating scales.
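As a minimal sketch of this score aggregation (in Python, with hypothetical function and variable names), the five ordinal scores can be collapsed into three generalized classes by grouping scores 1-2 and 4-5:

```python
# Sketch of the scale generalization used for the data-quality
# evaluation: ordinal scores 1-5 assigned to the rating classes are
# collapsed into three classes (1-2 low, 3 medium, 4-5 high concerns).

def generalize(score):
    """Map a 1-5 ordinal score to a three-class scale."""
    if score <= 2:
        return "low"      # very low to low concerns (scores 1 and 2)
    if score == 3:
        return "medium"   # medium concerns
    return "high"         # high to very high concerns (scores 4 and 5)

ratings = [1, 2, 2, 3, 5, 4]  # example ratings for one question
print([generalize(s) for s in ratings])
# -> ['low', 'low', 'low', 'medium', 'high', 'high']
```

Grouping the extreme scores in this way is what later allows a "mode-off by one level" comparison between groups.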

Data-collection exercise
The data-collection exercise was carried out in the municipality of Pontebba (FVG, Italy), within the mountain community of the Fella Basin (Fig. 1a). The only settlement with over 4000 inhabitants is Tarvisio, near the borders with Austria and Slovenia. The Fella catchment has an area of 700 km² and a mean altitude of 1140 m a.s.l. It consists mostly of limestone and is characterized by steep slopes and a high degree of tectonic deformation. The area is prone to landslides, flash floods and debris flows.
The latest severe alluvial event occurred on 29 August 2003. The total rainfall amount of the event was 389.6 mm, with particularly strong intensities for durations of 3 and 6 h (Borga et al., 2007). The event caused severe damage, created gullies and widened existing river beds. The most affected villages were Ugovizza, Valbruna, Malborghetto and Pontebba (Calligaris and Zini, 2012).
After the 2003 event, the technical services updated the inventory of debris flows, civil protection realized several mitigation measures in the affected areas, and the basin authorities produced an updated version of the hazard maps, the P.A.I.-FELLA (ADBVE, 2012). In all, 22 % of the check dams and 50 % of the bridges within the Fella Basin upstream of Pontebba (230 and 115 structures, respectively) fall within the hazardous areas defined in the P.A.I.
Civil protection selected the structures for the inspection tests (Fig. 1b). The complexity of the inspection tests differed according to the functional status of the structures. Then, structures for the inspection tests accounted for a range from minimal to serious concerns. Structures also included connected elements, such as retention basins and secondary check dams for scouring protection.
Finally, Table 4 describes the organization of the exercise, divided by sessions and inspection tests. The registration questionnaire remained open until the exercise day, in May 2013. See the website of the activity for more details on the questionnaire and training material.
After a common introductory session, participants inspected the same structures according to the CG or LG program. Every test took 15 min on average, which was actually faster than expected. First, all participants carried out an indoor pretest by looking at a poster. Then, the CG initiated inspections directly in the field without attending the learning session. The CG program was on different dates based on the participants' availability.
Instead, the LG continued their first day with the learning session, which had an indoor program followed by an outdoor session that included the structures of the pretest. In the outdoor session, the LG was divided into teams with representatives of each participants' group (one technician, civil-protection volunteers and one student). At first, each participant completed test 1 for check dams and bridges individually; then, they filled it out in teams for knowledge exchange. Senior technicians clarified further aspects, if needed. On the second day, the LG continued with tests 2 and 3, divided into two groups to reach the structure locations. At the end, all participants provided feedback and submitted the pictures they took during the inspections, if any.

Evaluation of the quality of data collected by volunteers
Tables 5 and 6 summarize results according to the component questions per parameter. Mean ordinal scores (X) and standard deviations (SD) were calculated from the ratings that participants reported in tests 1, 2 and 3. We then evaluated how effectively data were collected by Vs compared to Ts. Figures 2-6 summarize results according to the pretest and tests 1, 2 and 3. A distinction was made between Vs and Ts within participants of the learning and control groups. For that purpose, a frequency analysis was applied to the ordinal scores that Table 3 defines. This choice was based upon the relatively small sample size and the difference in number between the groups. We referred to the mode score for the data evaluation, as it represents the class with the highest frequency. In addition, we used the following criteria to assess the quality of collected data (EPA, 1997, p. 19-20):

- Accuracy is a "degree of agreement between the data collected and the true value on the condition being measured". We took as "true value" the mode score for Ts in the learning group. Figures 2-6 aggregate the relative frequencies into four frequency classes with reference to the true value: equal to or larger than 90 %, 70-90 %, 50-70 %, and smaller than 50 %. We chose this aggregation to distinguish different accuracy levels for each group. In addition, we assumed agreement among group members when a question had a relative frequency of at least 70 %. The overall agreement per parameter was then calculated by the ratio question / parameter, i.e., the number of questions with frequencies of at least 70 % over the total questions per parameter.

- Precision "refers to how well data collected are able to reproduce the result on the same group". For all participants, we represented precision by the standard deviation (SD) in Tables 5 and 6. In Figs. 2-6, we instead compare each group by looking at the mode scores and the mode-off by one level. The mode-off is a range in precision given by generalizing the extreme scores of the rating classes. For example, according to Table 3, we generalize rating scales from five to three classes by grouping very low to low concerns, medium concerns, and high to very high concerns; those are ordinal scores 1 and 2 on one side, and scores 4 and 5 on the other. Mode-off by one level in Figs. 2-6 only distinguishes questions where the scale generalization increased the relative frequencies.

- Completeness is the "measure of the amount of valid data actually obtained vs. the amount expected to be obtained". In Tables 5 and 6, completeness is evaluated by the number of answers given on the rating scales compared to the selection of unspecified answers. In Figs. 2-6, we evaluated completeness by distinguishing questions with relative frequencies larger than 14 % in the options "I don't know", "Could not be answered" and "No answer". We chose a threshold of 14 % to highlight the questions with the lowest completeness; it corresponds approximately to an absolute frequency of one participant in the control group or two participants in the learning group.
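The criteria above lend themselves to a simple computational sketch. The following Python fragment is illustrative only: the function names, the example answers and the handling of the 70 % threshold are our assumptions, not part of the original survey tooling.

```python
from collections import Counter

# Options used in the inspection forms to report unspecified answers.
UNSPECIFIED = {"I don't know", "Could not be answered", "No answer"}

def mode_score(answers):
    """Mode of the ordinal scores: the class with the highest frequency."""
    numeric = [a for a in answers if isinstance(a, int)]
    return Counter(numeric).most_common(1)[0][0]

def accuracy(answers, true_value):
    """Relative frequency of ratings equal to the 'true value'
    (the mode score of technicians in the learning group)."""
    numeric = [a for a in answers if isinstance(a, int)]
    return sum(1 for a in numeric if a == true_value) / len(numeric)

def overall_agreement(question_accuracies, threshold=0.70):
    """Ratio question / parameter: share of questions whose relative
    frequency at the true value reaches the agreement threshold."""
    hits = sum(1 for a in question_accuracies if a >= threshold)
    return hits / len(question_accuracies)

def completeness(answers):
    """Share of answers given on the rating scale rather than
    as unspecified options."""
    valid = sum(1 for a in answers if a not in UNSPECIFIED)
    return valid / len(answers)

# Hypothetical answers of one group to a single question.
group = [2, 2, 2, 3, "I don't know", 2, 4]
print(mode_score(group))                     # 2
print(round(accuracy(group, 2), 2))          # 0.67
print(round(completeness(group), 2))         # 0.86
print(overall_agreement([0.9, 0.65, 0.75]))  # agreement on 2 of 3 questions
```

Precision (SD of the ordinal scores) and the mode-off generalization would complete the picture but follow the same pattern.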
Other criteria, such as comparability and representativeness, were only considered in designing the form. "Comparability represents how well data from one form can be compared to data from another. Representativeness is the degree to which collected data actually represent the structure being inspected." (EPA, 1997, p. 19-20). We addressed comparability by using a standard form for bridges and check dams. For representativeness, we required a photo record from inspectors to support their choices and to provide additional information for the later examination of inspections. Finally, we used comments provided by participants during the sessions and in the feedback form to define corrective actions (Table 7). The following subsections present results on specific aspects for bridges and check dams, and then on common aspects for both structures.

Functional status of bridges for A and B parameters
Table 5 shows that parameters A and B in test 1 have mean scores between 1 and 2 in the functional status. Lower ordinal scores represent the best condition for inspected aspects (Fig. 3a). Except for Ts in the control group, Fig. 2 presents overall agreement close to one for parameter A, i.e., the ratio question / parameter for parameter A in test 1. For parameter B, overall agreement was reached only in the mode-off by one level, which represents lower precision in the B ratings regardless of the group.
However, performance in test 1 contrasts with that in test 2. The inspection complexity of bridge 2 was higher due to stream water flowing along the structure's pillars and abutments (Fig. 3b). For parameter A, Table 5 highlights a higher frequency of unspecified answers and ordinal scores with standard deviations larger than 1, and Fig. 2 shows accuracy levels below the relative frequency of 70 %. Consequently, there is disagreement in the mode score between Vs and Ts, regardless of the group. The presence of erosion in test 2, i.e., question A1, was mostly rated by participants as "Could not be answered", "No answer" or "No erosion". Moreover, those who reported erosion in the pillar and abutments did not distinguish between erosion with or without stream water along the basis.
For parameter A, in test 3, Fig. 2 shows better performance for the TLG and VLG as compared to the TCG and VCG. The difference in performance could represent some influence of the learning session. However, it also denotes the need for adjusting questions to avoid misunderstandings.
That is the case of question A3, which should explicitly address the status of protection works for downstream scouring in bridges (Fig. 3c). For parameter B, Ts and Vs only reached accuracy levels above 70 % when looking at the mode-off by one level, regardless of the test. However, question B3 (presence of islands with shrubs or man-made structures that reduce the opening for the flow) had the lowest precision in Table 5 and Fig. 2. Question B3 should therefore be split to better distinguish the presence of islands with vegetation from man-made obstructions.

Functional status of check dams for parameters A and B
Table 6 highlights the questions with more serious concerns and standard deviations above 1. Beyond the functional status itself, the presence of elements connected to the structures contributed to larger standard deviations. Thus, complexity in the inspection was higher due to the presence of a secondary structure in check dam 1 and a retention basin in check dam 2 (Fig. 3d and e). Figure 4 shows the lowest accuracy levels and overall agreement ratio for parameter A in test 3. Question A1 was the least accurate for Vs in tests 2 and 3. Those results may be explained by the rating scale we used (see the Supplement for check dams): for question A1, the rating classes did not distinguish slight deviations from strong ones, so medium concerns were not explicitly among the available options.
In addition, Fig. 4 shows higher frequencies of unspecified answers for questions A2 and A3 in tests 1 and 2. In test 1, unspecified answers were due to the water level at the basis of the structure; "Could not be answered" was even the preferred option for the TLG. In test 2, visibility of the basis of the structure was limited due to sediment accumulation. Finally, question A4 showed higher frequencies of unspecified answers for all structures (Table 6 and Fig. 4). The description of question A4 should be reviewed to avoid confusion with question A2, as in the case of connected structures for protection against downstream scouring. The classes to report in question A4 should be extended to consider all possible functional conditions. For parameter B, questions B1 and B2 have the lowest completeness in the pretest and test 1. Those questions were not relevant for the consolidation check dam; despite the "Does not apply" option, the VLG and TLG still preferred not to answer.

Common aspects for the functional status: parameter C and synthesis
In questions C1 and C2 for bridges and check dams, participants only distinguished the upstream and downstream locations during the field inspection (test 1). This contrasts with the pretest, the preliminary inspection test carried out before the learning session: mode scores were only assessable during the field inspections (test 1) due to the participants' difficulty in compiling the form in front of a poster. Similar pretest performance holds for check dam 1 and bridge 1, regardless of the parameter inspected. When looking at the overall agreement, the most subjective aspects were the level of erosion at the stream banks (parameter C) and the synthesis of the inspection. For the former, the description accompanying the rating classes should be improved, and the supporting scheme should be adjusted to minimize misunderstandings between the left and right banks. For the latter, the synthesis should remain optional for volunteer inspectors. However, Ts did not agree for all structures; participants therefore still require a short handout that can be taken to the field to support their choices.

Discussion
First-level inspection of hydraulic structures is a citizen science project that aims to support decisions about obstructions of hydraulic structures, or to pre-screen problems for more technical and detailed inspections. Its potential use especially holds in dynamic environments with a large number of hydraulic structures or where financial and human resources are limited (Danielsen et al., 2005; Holub and Hübl, 2008; de Jong, 2013).
In this research, we combined rating scales with ordinal scores to get an indication of the functional status. However, we still acknowledge the usefulness of photos and videos to support detailed descriptions (Dirksen et al., 2013; Yetman, 2002). Regardless of their importance, not all volunteers took photographs; they expressed difficulties in relating the photo record to the form. We then distinguished volunteer data from those obtained by professional staff. Referring to participants' characteristics, both the Ts and Vs were mostly adult men. However, Vs were younger, as their groups included students. Ts generally had a higher level of education than Vs, which can again be explained by the presence of students but also by the fact that volunteering for civil protection does not require a high level of education. Most Ts and half of the Vs were local inhabitants of the Fella Basin. The Vs who do not live there were mostly the students, who largely live in the FVG region.
Compared with data collected by Ts, data collected by Vs often had higher variance. Differences in accuracy can be explained by participants' hazard experience and preliminary knowledge for the inspection of hydraulic structures. Experience with natural hazards varied greatly across all groups: some participants had never experienced a natural hazard, while others had experienced them more than 10 times. Not surprisingly, Ts had very good prior knowledge of debris-flow phenomena, the functionality of check dams and culverts, as well as emergency security guidelines. Vs scored lower, especially in the VLG.
Survey procedures were discussed with the LG; thus, forms may have been interpreted differently when comparing the TLG and TCG. Similarly, the preliminary knowledge of the geosciences students may also have influenced their performance in the VCG compared with the VLG, which was mostly composed of citizen volunteers of civil protection. Differences between Vs and Ts can evidence a lack of training and unfamiliarity with survey protocols.
We found that the use of rating scales with a range in precision of one level could cope with some of the variance. Previous studies indicated the advantages of rating classes (Yetman, 2002) and of combining classes depending on the questions to be answered (Rinderer et al., 2012). However, visual inspections are subject to various sources of bias, for volunteers and technicians alike: limitations in the accuracy of recognizing defects, in the precision of describing defects according to a rating scale, and in the completeness of the inspection reports (Gouveia et al., 2004; Dirksen et al., 2013). Therefore, it was useful to include unspecified options distinguishing limitations such as water level and inspection conditions, which also provide complementary information for analyzing the reports.
Furthermore, we identified potential improvements in the survey procedures. Rating scales should consider all possible functional conditions, but only when distinction among different concerns is possible. Improvements in the forms are still required; for example, a separate form could be defined for culverts to address specific aspects of such structures (Najafi and Bhattachar, 2011). However, survey procedures should remain as simple as possible. Despite the need for improvement, several comments proved the utility of the activity: "as a good initiative to instruct volunteers on the observation of the territory with preventive scope, it joined theory and practice together on the field, it helped to understand and inspect the functionality of the structure".
Nevertheless, iterative design and additional testing are required before turning this pilot into a permanent activity. First, improvements in the inspection procedures should be tested separately with technicians to improve the robustness of the methods. Thus, we could validate the iterative design within the reference group and then discuss procedures with the technicians for the later examination of data and their use in decision-making.
The involvement of a mixed group of participants was valuable for knowledge exchange, particularly in the outdoor session. However, it also facilitated interaction during the inspection tests, which was not desirable. Replication exercises should therefore be carried out on a separate day for each participants' group to improve the consistency of the methods. Finally, participants should receive feedback on the quality evaluation after every inspection campaign. This may contribute to maintaining data quality during the design phase but also on a long-term basis.
Moreover, to be fully useful, citizen-based approaches require the effective combination of two practical aspects: recruiting and training strategies, and quality-assurance methods for data collection, comparison and examination (Crall et al., 2011; Riesch and Potter, 2014).
From the social perspective, a cornerstone beyond our research scope is the increase in volunteers' awareness of the water-sediment processes being addressed (Couvet et al., 2008). From the technical perspective, geoinformatic tools and mobile devices could facilitate data collection, access and validation (Newman et al., 2012). Data-management systems could support technicians in comparing and using the collected data for later examination.
In addition, participants could exploit smartphone applications for form compilation, completeness checks, data transfer and photo records. Additional tools could be included, such as an embedded glossary or systematic tagging. However, GPS signal coverage for mobile devices is especially limited in mountain catchments; therefore, a known ID for each structure may still be relevant. Future research using such applications should address data-quality requirements. Usability studies should explore the advantages for the diverse volunteers getting involved.
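To illustrate the kind of completeness check a smartphone form could run before submission, the sketch below combines a precompiled structure ID with a count of specified answers. This is not the study's software: the field names, the example record and the "unspecified" marker are hypothetical.

```python
# Illustrative sketch (not the study's software): a minimal inspection
# record with a precompiled structure ID and an automatic completeness
# check, as could be embedded in a smartphone data-collection form.
# Field names and the UNSPECIFIED marker are hypothetical examples.

UNSPECIFIED = "unspecified"

REQUIRED_FIELDS = ["structure_id", "structure_type", "water_level",
                   "abrasion", "cracks", "sediment_deposit"]

def completeness(record: dict) -> float:
    """Share of required fields answered with a specified value."""
    answered = sum(
        1 for field in REQUIRED_FIELDS
        if record.get(field) not in (None, "", UNSPECIFIED)
    )
    return answered / len(REQUIRED_FIELDS)

record = {
    "structure_id": "PNT-03",       # precompiled: GPS may be unreliable
    "structure_type": "check dam",  # precompiled in the form
    "water_level": "high",
    "abrasion": "moderate",
    "cracks": UNSPECIFIED,          # could not be assessed on site
    "sediment_deposit": "low",
}

print(completeness(record))  # 5 of 6 required fields are specified
```

An application could warn the participant whenever this ratio falls below 1, prompting either a specified rating or a deliberate unspecified answer before the report is transferred.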

Conclusions
Results showed that citizen volunteers could carry out first-level inspections with a performance comparable to that of technicians. Differences among the 11 technicians and 25 volunteers were not highly statistically significant when distinguishing between the control and learning groups. However, key points can still be extracted from this data set. Those considerations are relevant for the use of volunteers' data on the functional status of the structures.

Table: participants' feedback and corresponding corrective actions to the survey procedures.

Question: Did you find the options provided in the form useful to answer the questions?
- Feedback: when possible, rating scales with three or two classes should be extended to rate all possible statuses. Corrective action: to avoid misunderstandings, the question regarding the presence of protection works will refer to their length within the control distance.
- Feedback: the presence of human infrastructure should be an open question to report other infrastructure besides roads and buildings. Corrective action: the form will emphasize reporting the infrastructure that may be affected in case of high water levels.

Question: Which aspects did you not like about the activity?
- Feedback: all structures to inspect should have available information for the function type. Corrective action: information regarding the type of structure will always be precompiled in the form.
- Feedback: participants with a technical background considered the indoor session long, while citizen volunteers requested more time to better understand the theory and to carry out the inspections. Corrective action: the learning session should start directly with the outdoor session and finish with the indoor session.
- Feedback: the inspection in front of the poster could be better used after both theory and practice have been explained. Corrective actions: the indoor session will be carried out separately for each group of participants; interaction between groups will be limited to the outdoor session.
3. Unspecified answers may persist according to the complexity of the elements connected to the structure and to unexpected conditions during the inspection. Rating classes should specify when water or sediment did not allow the assessment. However, other options should be limited to facilitate the later examination of the data.
4. Volunteers' ratings should be considered a first-level assessment. Managers could combine these ratings to obtain indexes of the structure status at the parameter level. However, an indication of the overall completeness per parameter would still be needed for the later examination.
5. The use of scores to convert volunteers' ratings is important for obtaining an indication of the functional status. Since the rating scales are expressed in linguistic terms, ratings could be converted into numbers by using fuzzy set theory instead of ordinal scores. Converting the volunteers' data using scales of fuzzy terms could handle the required ranges in precision (e.g., from low to very low).
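Points 4 and 5 can be sketched in a few lines: linguistic ratings are mapped to ordinal scores, aggregated into a parameter-level index, and reported together with a completeness ratio that excludes unspecified answers. This is an illustrative sketch, not the study's method; the scale values, the triangular fuzzy numbers and the parameter grouping are hypothetical examples.

```python
# Illustrative sketch of converting linguistic ratings into numbers.
# The ordinal scores and the fuzzy-number values below are hypothetical.

ORDINAL = {"very low": 1, "low": 2, "moderate": 3, "high": 4, "very high": 5}

# A simple fuzzy alternative: triangular numbers (a, b, c) on [0, 1].
# Overlapping supports can represent a one-level range in precision
# (e.g., a rating between "low" and "very low"). Shown only as a
# representation alternative; it is not used in the index below.
TRIANGULAR = {
    "very low":  (0.00, 0.00, 0.25),
    "low":       (0.00, 0.25, 0.50),
    "moderate":  (0.25, 0.50, 0.75),
    "high":      (0.50, 0.75, 1.00),
    "very high": (0.75, 1.00, 1.00),
}

def parameter_index(ratings):
    """Mean ordinal score over the specified ratings of one parameter,
    reported together with the completeness ratio (specified/total)."""
    scored = [ORDINAL[r] for r in ratings if r in ORDINAL]
    completeness = len(scored) / len(ratings) if ratings else 0.0
    index = sum(scored) / len(scored) if scored else None
    return index, completeness

# Four ratings for one parameter; one answer is unspecified.
index, ratio = parameter_index(["low", "very low", "unspecified", "low"])
print(index, ratio)  # index = (2 + 1 + 2) / 3, completeness = 3/4
```

Reporting the index and the completeness ratio side by side keeps unspecified answers visible to the technicians examining the data, instead of silently lowering or inflating the parameter score.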
Important considerations to improve and promote citizen-science projects are, firstly, the limitations on citizen involvement due to differing cultures of volunteer activity and interest in participating. Secondly, training is relevant for the performance of volunteers, but also for increasing awareness of and preparedness for the causes and consequences of hydro-meteorological hazards (Enders, 2001). Regarding the former, recruiting students is an alternative approach where volunteer culture is limited. Universities could involve students of the geosciences or social sciences seeking to gain practical knowledge or to better understand their territory (Savan et al., 2003).
Regarding the latter, future research should test the effectiveness of the learning session according to differences in the participants' prior knowledge. In the study area, volunteer groups offered opportunities to carry out first-level inspections. However, replication exercises are still needed in other study areas to improve the consistency and robustness of the data evaluation. In addition, survey procedures could be adapted to other target groups, e.g., final-year high school students who are not yet aware of or involved in management activities. That could be an alternative approach to enhance their awareness of hydro-meteorological hazards.