A Prototyped NL-Based Approach for the Design of Multidimensional Data Warehouse

Authors

  • Abeer Alzahrani, College of Computer Science & Engineering, University of Jeddah, Jeddah, Saudi Arabia
  • Mohamed Alqarni, College of Computer Science & Engineering, University of Jeddah, Jeddah, Saudi Arabia
  • Jamel Feki, University of Jeddah

DOI:

https://doi.org/10.24203/ijcit.v9i5.39

Keywords:

Data Warehouse, Multidimensional schema, NL-templates, Decisional requirements

Abstract

Organizations are increasingly interested in Data Warehouse (DW) technology and data analytics, seeking to base their decision-making processes on scientific arguments rather than intuition. Despite the efforts invested, DW design remains a challenging research domain. The design quality of a DW depends on several aspects, including requirements gathering. In this context, we propose a Natural Language (NL) based design approach with a twofold aim. First, it facilitates the involvement of decision-makers in the DW design process: NL encourages decision-makers to express their requirements as English-like sentences that conform to NL-templates. Second, it semi-automatically generates a DW schema from a set of requirements gathered as analytical queries compliant with the NL-templates. This design approach relies on (i) two easy-to-use NL-templates for specifying the analysis components, and (ii) a set of five heuristic rules for extracting the multidimensional concepts from the requirements. We demonstrate the feasibility of our approach by developing the prototype Natural Language Decisional Requirements to DW Schema (NLDR2DWS).
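
The general idea of turning template-conformant requirements into a schema outline can be illustrated with a minimal sketch. The abstract does not give the two NL-templates or the five heuristic rules, so everything below is an assumption made for illustration only: the sentence pattern "Analyze <measures> of <fact> by <dimensions>", the function names parse_requirement and merge_into_schema, and the single extraction rule (measures before "of", dimensions after "by") are hypothetical and are not the NLDR2DWS implementation.

```python
import re

# Hypothetical template (NOT taken from the paper):
#   "Analyze <measure>[, <measure> ...] of <fact> by <dim>[, <dim> ...]"
TEMPLATE = re.compile(
    r"^analyze\s+(?P<measures>.+?)\s+of\s+(?P<fact>.+?)\s+by\s+(?P<dims>.+?)\.?$",
    re.IGNORECASE,
)


def split_items(text):
    """Split an enumeration such as 'a, b and c' into lowercase items."""
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def parse_requirement(sentence):
    """Return the fact, measures and dimensions named in one requirement,
    or None when the sentence does not conform to the assumed template."""
    match = TEMPLATE.match(sentence.strip())
    if match is None:
        return None
    return {
        "fact": match.group("fact").strip().lower(),
        "measures": split_items(match.group("measures")),
        "dimensions": split_items(match.group("dims")),
    }


def merge_into_schema(requirements):
    """Merge several requirements into one outline per fact,
    keeping first-seen order and dropping duplicates."""
    schema = {}
    for sentence in requirements:
        parsed = parse_requirement(sentence)
        if parsed is None:
            continue  # non-conformant sentences are skipped in this sketch
        entry = schema.setdefault(
            parsed["fact"], {"measures": [], "dimensions": []}
        )
        for key in ("measures", "dimensions"):
            for item in parsed[key]:
                if item not in entry[key]:
                    entry[key].append(item)
    return schema


if __name__ == "__main__":
    requirements = [
        "Analyze quantity and amount of sales by product, store and month.",
        "Analyze amount of sales by customer",
    ]
    print(merge_into_schema(requirements))
```

In this sketch each fact accumulates the measures and dimensions mentioned across all conformant sentences, which mirrors, in a very simplified way, the notion of deriving one schema from a set of requirements. A real extraction step would also need to handle synonyms, hierarchies within dimensions, and validation by the designer, none of which is attempted here.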

Published

2020-09-30

How to Cite

Alzahrani, A., Alqarni, M., & Feki, J. (2020). A Prototyped NL-Based Approach for the Design of Multidimensional Data Warehouse. International Journal of Computer and Information Technology (2279-0764), 9(5). https://doi.org/10.24203/ijcit.v9i5.39