
Graduate student: 希則 (Ordonez, Cesar)
Thesis title: Starting Up the Big Data Engine: Sparking Data Analytic Thinking Through Data Extraction and Exploration in Startups
Chinese title: 啟動大數據引擎: 績優資料提取和新創公司的發展激起數據分析思維
Advisor: 雷松亞 (Ray, Soumya)
Committee members: 許裴舫 (Hsu, Pei Fang); 徐茉莉 (Shmueli, Galit)
Degree: Master
Department: College of Technology Management, International Master of Business Administration (IMBA)
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 44
Keywords: Startups, ETL, JSON, Data Analytics, Principal Component Analysis, Hierarchical Clustering
  • Data analytics in startups is usually delayed to later stages of the product, which is understandable given that startups focus on constantly delivering a product and may have little or no data to analyze. Although a startup may use a commercial data analytics framework, sooner or later the data it captures itself becomes a valuable source of insight for strategy and decision-making. A successful data analytics initiative requires two components. The first is a motivated and collaborative startup team: management should be motivated by the ease of an exploratory analysis and its results, as well as by the small amount of work required of them to take part. The second is technical, but it cannot occur without the first: although it requires skills, the analysis team also needs the collaboration and support of the startup team to answer questions about the company and its data. This initial analytics project yielded useful results for a startup that wants to identify its main users from its own data, but segmentation is just one of the many possibilities of data analytics. By focusing on flexible code built for change through frameworks such as extraction-transformation-loading (ETL), the project created a foundation for further data analytics projects such as predictive analytics.



    Table of Contents
    Abstract
    Acknowledgement
    1. Introduction
        1.1 PicCollage
    2. The Status of Data Analysis in Startups
        2.1 Emphasis on Minimum Viable Product
        2.2 Agile & DevOps
        2.3 Dependence on Ready Made Solutions
        2.4 The Startups' Data
            2.4.1 No Data
            2.4.3 Data in Non-Relational Format
            2.4.2 Data Tasks to the Back Burner
    3. Sparking Data Analytics in Startups
        3.1 Finding a Champion
        3.2 Removing the Data Extraction Burden
        3.3 Built-for-Change
        3.4 Finding the Low-Hanging Fruit
        3.5 Adapting to the Startup's Workflow
    4. Methodology
        4.1 Engaging the Product Manager
        4.2 Agile Extraction-Transformation-Loading
            4.2.1 Limitations of Third-Party Data
            4.2.2 API Access to Data
            4.2.3 SQLite, Sequel, and Migration Files
            4.2.4 Kiba and ETL
        4.3 First Objective: User Segmentation
            4.3.1 Collage Structures
            4.3.2 Collage Structure Dimensions
            4.3.3 PCA for Users
            4.3.4 Hierarchical Clustering
        4.4 Adapting to Startup Workflow
            4.4.1 Trello
            4.4.3 Ruby
    5. Outcome and Challenges
        5.1 Results
        5.2 Challenges
            5.2.1 Becoming a domain expert
            5.2.2 Data is not perfect
            5.2.3 Extracting data from mobile analytic frameworks
            5.2.4 Creating diagrams and dictionaries
    6. Further Work & Conclusion
        6.1 Integrating ETL into the Service Architecture
        6.2 Predictive Analytics
        6.4 Final Remarks


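    Full-text release date: full text not authorized for public release (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)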
