I'm new to Hive / Hadoop and have learned that Hive is schema on read, which means you don't need to define a schema when writing data into HDFS. I have a question about this. Once a large amount of data has landed in HDFS, we run Hive queries to read it, and to read it we must already have a schema defined. So we design the schema, put it into the metastore, create tables, and so on. Later, we write more data into the same table and database in HDFS. Isn't that now writing data against a schema? Does it still count as schema on read, or is it schema on write as well? Please add anything I'm missing here.
Note:
1. The first time, I didn't use HiveQL to store the data; I just copied the files into HDFS.
2. From now on, I'm loading data into HDFS using HiveQL.
P.S. I couldn't find an answer to this on Google, Stack Overflow, or Quora. Please help :)
OP, any idea how to set up connection pooling?
Not sure, but here's my two cents. Schema on read means Hive lets you write whatever you want without checking it against the schema. Only when you try to read that data will you find out that bad data got in, typically as NULLs in fields that don't match the declared column types. Does this help?
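To make the idea concrete, here is a small Python sketch of schema on read. It is only an illustration, not how Hive is implemented: `SCHEMA`, `write_raw`, and `read_with_schema` are made-up names, and a plain list stands in for files in HDFS. The write step accepts any line, and the schema is applied only when reading, with values that don't fit turned into `None` (similar in spirit to the NULLs you would usually see in a Hive text table).

```python
import csv

# Hypothetical table schema: (column name, type applied at read time).
SCHEMA = [("id", int), ("name", str), ("amount", float)]


def write_raw(storage, line):
    """Append a line as-is. Nothing is checked against SCHEMA,
    much like copying a file straight into HDFS."""
    storage.append(line)


def read_with_schema(storage, schema=SCHEMA):
    """Apply the schema only when reading. A value that is missing or
    cannot be converted to the column type becomes None."""
    rows = []
    for record in csv.reader(storage):
        row = {}
        for i, (col, cast) in enumerate(schema):
            try:
                row[col] = cast(record[i])
            except (ValueError, IndexError):
                row[col] = None
        rows.append(row)
    return rows


storage = []  # stand-in for a directory of files in HDFS
write_raw(storage, "1,alice,10.5")
write_raw(storage, "oops,bob,not_a_number")  # accepted on write
write_raw(storage, "3,carol")                # missing column, also accepted

for row in read_with_schema(storage):
    print(row)
```

All three writes succeed; the mismatches only show up in what the read returns. Writing later "against the table" doesn't change this, because the check still happens at read time, not at write time.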
It can't be called schema on write, because HDFS is just file-based storage and you can write whatever you want to it. And yes, writing data into HDFS using HiveQL is accepted practice in the industry, as long as you don't break the existing schema.