Knowledge Discovery (KD) is a process that involves data preparation, mining and post-processing. Its purpose is to extract information from patterns in the data that are interesting and relevant to the user but are sometimes hidden in huge databases. The data preparation phase is important because data are rarely collected for KD purposes. The mining phase is the one that actually finds patterns in the data, ensuring that they correspond to user demands. Finally, the post-processing phase transforms and presents the extracted patterns.
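As an illustration only, the three phases can be sketched as a small Python pipeline. The function names (prepare, mine, post_process), the toy shopping-basket records and the choice of frequent item pairs as the pattern type are assumptions made for this example; they are not the techniques proposed in this research line.

```python
from collections import Counter
from itertools import combinations

def prepare(raw_records):
    """Data preparation: normalise items, drop missing values and empty records."""
    cleaned = []
    for record in raw_records:
        items = {item.strip().lower() for item in record if item and item.strip()}
        if items:
            cleaned.append(items)
    return cleaned

def mine(transactions, min_support):
    """Mining: keep item pairs that occur in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

def post_process(patterns, total):
    """Post-processing: rank patterns by support and render them for the user."""
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: {n}/{total}" for (a, b), n in ranked]

# Hypothetical raw data, deliberately noisy (mixed case, stray spaces, gaps).
raw = [["Bread", "milk "], ["bread", "Milk", "eggs"], [], ["eggs", None, "milk"]]
transactions = prepare(raw)
report = post_process(mine(transactions, min_support=2), len(transactions))
print("\n".join(report))
```

Here min_support stands in for the user demand that the mining phase must satisfy, and the preparation step shows why data not collected for KD purposes need cleaning before any pattern can be found.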
There are three important problems in the context of data mining that still require much research: data uncertainty, temporal evolution [6, 101], and scalability of techniques and algorithms [8, 91, 104]. Because of the decentralized and emergent nature of the Web, the quality of the information available on it is associated with high uncertainty, since there is no systematic way of validating or controlling the data. Temporal evolution is a phenomenon present in all areas of knowledge, including the Web. The challenge here is to understand, model and explore temporal information to improve the KD process. The challenge in dealing with scalability goes beyond the one previously stated, since both the amount of data and the complexity of the patterns keep increasing.
This research area investigates algorithms and KD techniques that deal with the three problems presented above. In particular, it considers the use of soft computing algorithms, given their ability to handle uncertainty, imprecision and partial truth. There is also interest in the design of characterization models and data mining techniques that take temporal information into account, and in the development of scalable algorithms for mining complex patterns.
The algorithms and techniques studied in this line of research will contribute to solving the problems described in Challenges 1, 2 and 3. Considering Challenge 1, they will be directly involved in goals 1.1 through 1.4. As to Challenge 2, they will be useful in reaching objectives 2.5 and 2.6. Concerning Challenge 3, the algorithms and techniques will use results provided by goal 3.5 and contribute to objectives 3.1 and 3.6.
The Knowledge Discovery (KD) research line will be carried out primarily by the researchers Wagner Meira (UFMG), Marcos Gonçalves (UFMG), Gisele L. Pappa (UFMG), and Adriano Pereira (UFMG).