EzDevInfo.com

replication interview questions

Top replication frequently asked interview questions

What are the scenarios for using mirroring, log shipping, replication and clustering in SQL Server

As far as I know, SQL Server provides four techniques for better availability.

I think these are the primary usage scenarios, in summary:

1) Replication is primarily suited to online-offline data synchronization scenarios (laptops, mobile devices, remote servers).

2) Log shipping can be used to maintain a failover server with manual switching.

3) Database mirroring is an automatic failover technique.

4) Failover clustering is an advanced type of database mirroring.

Am I right?

Thanks.


Source: (StackOverflow)

I get an "An attempt was made to load a program with an incorrect format" error on a SQL Server replication project

The exact error is as follows:

Could not load file or assembly 'Microsoft.SqlServer.Replication, Version=9.0.242.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. An attempt was made to load a program with an incorrect format.

I've recently started working on this project again after a two-month move to another project. It worked perfectly before, and I've double-checked all the references.


Source: (StackOverflow)

How can I slow down a MySQL dump so as not to affect current load on the server?

While doing a MySQL dump is easy enough, I have a live dedicated MySQL server that I want to set up replication on. To do this, I need dumps of the databases to import into my replication slave.

The issue comes when I do the dumps: MySQL goes at it full force and ties up resources needed by the sites that connect to it. Is there a way to run the dump queries at low priority, so that preference is given to live connections? The idea is that the load from external sites should not be affected by MySQL's effort to do a full dump.


Source: (StackOverflow)

Solr indexing - Master/Slave replication, how to handle huge index and high traffic?

I'm currently facing an issue with Solr (more exactly, with slave replication), and after spending quite some time reading online I find myself having to ask for some enlightenment.

- Does Solr have some limitation in size for its index?

When dealing with a single master, when is the right moment to decide to use multiple cores or multiple indexes? Are there any indications of an index size at which partitioning is recommended?

- Is there any max size when replicating segments from master to slave?

When replicating, is there a segment size limit beyond which the slave can't download the content and index it? What is the threshold beyond which a slave can't replicate when there's a lot of read traffic and lots of new documents to replicate?

To be more factual, here is the context that led me to these questions: we want to index a fair number of documents, but when the count reaches more than a dozen million, the slaves can't handle it and start failing to replicate with a SnapPull error. The documents are composed of a few text fields (name, type, description, ... about 10 other fields of, let's say, 20 characters max).

We have one master and 2 slaves which replicate data from the master.

This is my first time working with Solr (I usually work on webapps using Spring, Hibernate... but not Solr), so I'm not sure how to tackle this issue.

Our idea for the moment is to add multiple cores to the master, and have a slave replicating from each of these cores. Is that the right way to go?

If it is, how can we determine the number of cores needed? Right now we're just going to try it, see how it behaves, and adjust if necessary, but I was wondering whether there are any best practices or benchmarks on this specific topic.

For this amount of documents with this average size, x cores or indexes are needed ...

Thanks for any help on how I could deal with a huge number of documents of average size!

Here is a copy of the error that is being thrown when a slave is trying to replicate:

ERROR [org.apache.solr.handler.ReplicationHandler] - <SnapPull failed >
org.apache.solr.common.SolrException: Index fetch failed :
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.lang.RuntimeException: java.io.IOException: read past EOF
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:418)
        at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:467)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
        ... 11 more
Caused by: java.io.IOException: read past EOF
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
        at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
        at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:70)
        at org.apache.lucene.index.SegmentInfos$2.doBody(SegmentInfos.java:410)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:704)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:538)
        at org.apache.lucene.index.SegmentInfos.readCurrentVersion(SegmentInfos.java:402)
        at org.apache.lucene.index.DirectoryReader.isCurrent(DirectoryReader.java:791)
        at org.apache.lucene.index.DirectoryReader.doReopen(DirectoryReader.java:404)
        at org.apache.lucene.index.DirectoryReader.reopen(DirectoryReader.java:352)
        at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:413)
        at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:424)
        at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:35)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1049)
        ... 14 more

EDIT: After Mauricio's answer, the Solr libraries have been updated to 1.4.1, but this error was still raised. I increased the commitReserveDuration, and although the "SnapPull failed" error seems to have disappeared, another one started being raised. I'm not sure why, as I can't find much about it on the web:

ERROR [org.apache.solr.servlet.SolrDispatchFilter] - <ClientAbortException:  java.io.IOException
        at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:370)
        at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:323)
        at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:396)
        at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:385)
        at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:89)
        at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:183)
        at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:89)
        at org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48)
        at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:322)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.jstripe.tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:837)
        at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:640)
        at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1286)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.io.IOException
        at org.apache.coyote.http11.InternalAprOutputBuffer.flushBuffer(InternalAprOutputBuffer.java:703)
        at org.apache.coyote.http11.InternalAprOutputBuffer$SocketOutputBuffer.doWrite(InternalAprOutputBuffer.java:733)
        at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:124)
        at org.apache.coyote.http11.InternalAprOutputBuffer.doWrite(InternalAprOutputBuffer.java:539)
        at org.apache.coyote.Response.doWrite(Response.java:560)
        at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:365)
        ... 22 more
>
ERROR [org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/].[SolrServer]] - <Servlet.service() for servlet SolrServer threw exception>
java.lang.IllegalStateException
        at org.apache.catalina.connector.ResponseFacade.sendError(ResponseFacade.java:405)
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.jstripe.tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:837)
        at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:640)
        at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1286)
        at java.lang.Thread.run(Thread.java:595)

I still wonder what the best practices are for handling a big index (more than 20G) containing a lot of documents with Solr. Am I missing some obvious links somewhere? Tutorials, documentation?


Source: (StackOverflow)

Scaling solutions for MySQL (Replication, Clustering)

At the startup I'm working at, we are now considering scaling solutions for our database. Things get somewhat confusing (for me at least) with MySQL, which has MySQL Cluster, replication, and MySQL Cluster replication (from ver. 5.1.6), which is an asynchronous version of MySQL Cluster. The MySQL manual explains some of the differences in its cluster FAQ, but it is hard to ascertain from it when to use one or the other.

I would appreciate any advice from people who are familiar with the differences between these solutions: what are the pros and cons, and when would you recommend using each?


Source: (StackOverflow)

Can you index tables differently on Master and Slave (MySQL)?

Is it possible to set up different indexing on a read-only slave than on the master? This seems to make sense given the different requirements of the two systems, but I want to make sure it will work and not cause any problems.


Source: (StackOverflow)

Has anyone figured out how to scale Amazon RDS read replicas?

I've recently set up a read replica to take some of the read load off of my Amazon multi-AZ RDS instance. The Amazon documentation clearly states that it is "up to your application to determine how read traffic is distributed across your read replicas".

Has anyone figured out a manageable way to scale read replicas? It doesn't seem like a very extensible solution to have different parts of my application hard-coded to read from specific replicas. Is there a way to set this up that is analogous to putting EC2 instances behind a load balancer?
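One possible direction, as a minimal sketch only: keep the list of replica endpoints in configuration and let a small pool object rotate through them, so no part of the application is tied to a specific replica. The `ReplicaPool` class and the endpoint names below are hypothetical and are not part of RDS or any AWS SDK; health checks and replica lag are deliberately ignored.

```python
import itertools
import threading


class ReplicaPool:
    """Hands out read-replica endpoints in round-robin order (hypothetical helper)."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one replica endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        self._lock = threading.Lock()

    def next_endpoint(self):
        # Guard the shared iterator so concurrent request threads
        # each receive exactly one endpoint per call.
        with self._lock:
            return next(self._cycle)


# Placeholder endpoint names, for illustration only.
pool = ReplicaPool([
    "replica-1.example.internal",
    "replica-2.example.internal",
])

# Each read-only request asks the pool where to connect.
host_for_this_request = pool.next_endpoint()
```

Adding a replica then means adding one entry to the configured list rather than changing application code.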


Source: (StackOverflow)

INSERT ... ON DUPLICATE KEY UPDATE with WHERE?

I'm doing an INSERT ... ON DUPLICATE KEY UPDATE, but I need the update part to be conditional, so that the update only happens if some extra condition is met.

However, WHERE is not allowed on this UPDATE. Is there any workaround for this?

I can't do combinations of INSERT/UPDATE/SELECT since this needs to work over a replication.
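A workaround that keeps everything in one statement, sketched here under assumptions: MySQL's `IF()` function can be used inside the update list so that each column is either overwritten or assigned its own current value. The helper below only builds the statement text; the function name, the table and column names in the usage line, and the "new guard value must be greater" condition are all illustrative choices, not taken from the question.

```python
def conditional_upsert_sql(table, key_col, data_col, guard_col):
    """Build a MySQL upsert whose UPDATE part only takes effect when the
    incoming guard_col value is greater than the stored one.

    Identifiers are interpolated directly, so they must come from trusted
    code, never from user input. Values are left as %s placeholders.
    """
    return (
        f"INSERT INTO {table} ({key_col}, {data_col}, {guard_col}) "
        f"VALUES (%s, %s, %s) "
        f"ON DUPLICATE KEY UPDATE "
        # The data column is assigned first, so its IF() still compares
        # against the old guard value; the guard column is assigned last.
        f"{data_col} = IF(VALUES({guard_col}) > {guard_col}, VALUES({data_col}), {data_col}), "
        f"{guard_col} = IF(VALUES({guard_col}) > {guard_col}, VALUES({guard_col}), {guard_col})"
    )


# Hypothetical table and columns, for illustration only.
sql = conditional_upsert_sql("items", "id", "payload", "version")
```

When the condition is false, every column is set to its existing value, so the row is left unchanged without needing a WHERE clause.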


Source: (StackOverflow)

JVM heap replication between two machines

What are the basic principles by which two separate computers, connected within the same network and running the same Java application, maintain the same state by syncing their heaps with each other?

I believe Terracotta does this, but I have no idea what pseudocode describing its core functions would look like.

I'm just looking for understanding of this technology.


Source: (StackOverflow)

Error: "Could not initialize master info structure" while doing Master Slave Replication in MySQL

I am trying to set up master-slave replication for MySQL. When I type the following commands:

mysql> CHANGE MASTER TO MASTER_HOST='10.1.100.1', MASTER_USER='slave_user', MASTER_PASSWORD='slave_password', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=451228;
mysql> START SLAVE;

MySQL throws the following error:

ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log

Any help would be greatly appreciated.


Source: (StackOverflow)

MySQL: Very slow update/insert/delete queries hanging on "query end" step

I have a large and heavily loaded MySQL database which performs quite fast at times, but sometimes gets terribly slow. All tables are InnoDB, the server has 32GB of RAM, and the database size is about 40GB.

The top 20 queries in my slow_query_log are update, insert and delete queries, and I cannot understand why they are so slow (up to 120 seconds sometimes!).

Here is the most frequent query:

UPDATE comment_fallows set comment_cnt_new = 0 WHERE user_id = 1;

Profiling results:

mysql> set profiling = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> update comment_fallows set comment_cnt_new = 0 where user_id = 1;
Query OK, 0 rows affected (2.77 sec)
Rows matched: 18  Changed: 0  Warnings: 0

mysql> show profile for query 1;
+---------------------------+----------+
| Status                    | Duration |
+---------------------------+----------+
| starting                  | 0.000021 |
| checking permissions      | 0.000004 |
| Opening tables            | 0.000010 |
| System lock               | 0.000004 |
| init                      | 0.000041 |
| Searching rows for update | 0.000084 |
| Updating                  | 0.000055 |
| end                       | 0.000010 |
| query end                 | 2.766245 |
| closing tables            | 0.000007 |
| freeing items             | 0.000013 |
| logging slow query        | 0.000003 |
| cleaning up               | 0.000002 |
+---------------------------+----------+
13 rows in set (0.00 sec)

I am using master/slave replication, so the binary log is enabled. I followed some advice I found on the internet and set innodb_flush_log_at_trx_commit to 0, but it did not make any difference:

mysql> show variables like '%trx%';
+-------------------------------------------+-------+
| Variable_name                             | Value |
+-------------------------------------------+-------+
| innodb_flush_log_at_trx_commit            | 0     |
| innodb_use_global_flush_log_at_trx_commit | ON    |
+-------------------------------------------+-------+

The table structure:

CREATE TABLE `comment_fallows` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) NOT NULL,
  `part_id` int(11) DEFAULT NULL,
  `article_id` int(11) DEFAULT NULL,
  `request_id` int(11) DEFAULT NULL,
  `comment_cnt` int(10) unsigned NOT NULL,
  `comment_cnt_new` int(10) unsigned NOT NULL DEFAULT '0',
  `last_comment_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`,`last_comment_date`),
  KEY `part_id` (`part_id`),
  KEY `last_comment_date` (`last_comment_date`),
  KEY `request_id` (`request_id`),
  CONSTRAINT `comment_fallows_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE,
  CONSTRAINT `comment_fallows_ibfk_2` FOREIGN KEY (`part_id`) REFERENCES `fanfic_parts` (`id`) ON DELETE CASCADE,
  CONSTRAINT `comment_fallows_ibfk_3` FOREIGN KEY (`request_id`) REFERENCES `requests` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=2239419 DEFAULT CHARSET=utf8

And all the innodb settings (server has 32 GB of RAM):

mysql> show variables like '%innodb%';
+-------------------------------------------+------------------------+
| Variable_name                             | Value                  |
+-------------------------------------------+------------------------+
| have_innodb                               | YES                    |
| ignore_builtin_innodb                     | OFF                    |
| innodb_adaptive_flushing                  | ON                     |
| innodb_adaptive_flushing_method           | estimate               |
| innodb_adaptive_hash_index                | ON                     |
| innodb_adaptive_hash_index_partitions     | 1                      |
| innodb_additional_mem_pool_size           | 16777216               |
| innodb_autoextend_increment               | 8                      |
| innodb_autoinc_lock_mode                  | 1                      |
| innodb_blocking_buffer_pool_restore       | OFF                    |
| innodb_buffer_pool_instances              | 1                      |
| innodb_buffer_pool_restore_at_startup     | 0                      |
| innodb_buffer_pool_shm_checksum           | ON                     |
| innodb_buffer_pool_shm_key                | 0                      |
| innodb_buffer_pool_size                   | 21474836480            |
| innodb_change_buffering                   | all                    |
| innodb_checkpoint_age_target              | 0                      |
| innodb_checksums                          | ON                     |
| innodb_commit_concurrency                 | 0                      |
| innodb_concurrency_tickets                | 500                    |
| innodb_corrupt_table_action               | assert                 |
| innodb_data_file_path                     | ibdata1:10M:autoextend |
| innodb_data_home_dir                      |                        |
| innodb_dict_size_limit                    | 0                      |
| innodb_doublewrite                        | ON                     |
| innodb_doublewrite_file                   |                        |
| innodb_fake_changes                       | OFF                    |
| innodb_fast_checksum                      | OFF                    |
| innodb_fast_shutdown                      | 1                      |
| innodb_file_format                        | Antelope               |
| innodb_file_format_check                  | ON                     |
| innodb_file_format_max                    | Antelope               |
| innodb_file_per_table                     | ON                     |
| innodb_flush_log_at_trx_commit            | 0                      |
| innodb_flush_method                       |                        |
| innodb_flush_neighbor_pages               | area                   |
| innodb_force_load_corrupted               | OFF                    |
| innodb_force_recovery                     | 0                      |
| innodb_ibuf_accel_rate                    | 100                    |
| innodb_ibuf_active_contract               | 1                      |
| innodb_ibuf_max_size                      | 10737401856            |
| innodb_import_table_from_xtrabackup       | 0                      |
| innodb_io_capacity                        | 10000                  |
| innodb_kill_idle_transaction              | 0                      |
| innodb_large_prefix                       | OFF                    |
| innodb_lazy_drop_table                    | 0                      |
| innodb_lock_wait_timeout                  | 120                    |
| innodb_locks_unsafe_for_binlog            | OFF                    |
| innodb_log_block_size                     | 512                    |
| innodb_log_buffer_size                    | 8388608                |
| innodb_log_file_size                      | 268435456              |
| innodb_log_files_in_group                 | 3                      |
| innodb_log_group_home_dir                 | ./                     |
| innodb_max_dirty_pages_pct                | 90                     |
| innodb_max_purge_lag                      | 0                      |
| innodb_mirrored_log_groups                | 1                      |
| innodb_old_blocks_pct                     | 37                     |
| innodb_old_blocks_time                    | 0                      |
| innodb_open_files                         | 300                    |
| innodb_page_size                          | 16384                  |
| innodb_purge_batch_size                   | 20                     |
| innodb_purge_threads                      | 1                      |
| innodb_random_read_ahead                  | OFF                    |
| innodb_read_ahead                         | linear                 |
| innodb_read_ahead_threshold               | 56                     |
| innodb_read_io_threads                    | 8                      |
| innodb_recovery_stats                     | OFF                    |
| innodb_recovery_update_relay_log          | OFF                    |
| innodb_replication_delay                  | 0                      |
| innodb_rollback_on_timeout                | OFF                    |
| innodb_rollback_segments                  | 128                    |
| innodb_show_locks_held                    | 10                     |
| innodb_show_verbose_locks                 | 0                      |
| innodb_spin_wait_delay                    | 6                      |
| innodb_stats_auto_update                  | 1                      |
| innodb_stats_method                       | nulls_equal            |
| innodb_stats_on_metadata                  | ON                     |
| innodb_stats_sample_pages                 | 8                      |
| innodb_stats_update_need_lock             | 1                      |
| innodb_strict_mode                        | OFF                    |
| innodb_support_xa                         | ON                     |
| innodb_sync_spin_loops                    | 30                     |
| innodb_table_locks                        | ON                     |
| innodb_thread_concurrency                 | 16                     |
| innodb_thread_concurrency_timer_based     | OFF                    |
| innodb_thread_sleep_delay                 | 10000                  |
| innodb_use_global_flush_log_at_trx_commit | ON                     |
| innodb_use_native_aio                     | ON                     |
| innodb_use_sys_malloc                     | ON                     |
| innodb_use_sys_stats_table                | OFF                    |
| innodb_version                            | 1.1.8-rel25.1          |
| innodb_write_io_threads                   | 8                      |
+-------------------------------------------+------------------------+
92 rows in set (0.00 sec)

I've been struggling with this problem for weeks and would be very grateful for any advice on how to solve it.

Why could my update, insert and delete queries be so slow at the "query end" step?

Update

I have disabled the query cache, but update, insert and delete queries are still very slow (nothing changed).

show variables like '%cache%';
+------------------------------+----------------------+
| Variable_name                | Value                |
+------------------------------+----------------------+
| binlog_cache_size            | 4194304              |
| binlog_stmt_cache_size       | 32768                |
| have_query_cache             | YES                  |
| key_cache_age_threshold      | 300                  |
| key_cache_block_size         | 1024                 |
| key_cache_division_limit     | 100                  |
| max_binlog_cache_size        | 18446744073709547520 |
| max_binlog_stmt_cache_size   | 18446744073709547520 |
| metadata_locks_cache_size    | 1024                 |
| query_cache_limit            | 16777216             |
| query_cache_min_res_unit     | 4096                 |
| query_cache_size             | 0                    |
| query_cache_strip_comments   | OFF                  |
| query_cache_type             | ON                   |
| query_cache_wlock_invalidate | OFF                  |
| stored_program_cache         | 256                  |
| table_definition_cache       | 400                  |
| table_open_cache             | 2048                 |
| thread_cache_size            | 8                    |
+------------------------------+----------------------+

Source: (StackOverflow)

Set Identity_insert on - Merge Replication

I have merge replication set up between two databases and am using identity ranges on both.

I want to add a specific row to a merged table (setting the identity value to something outside of the identity range) on the publisher. When I try this, I get the following error.

The insert failed. It conflicted with an identity range check constraint in database 'xxx', replicated table 'dbo.yyy', column 'yyy_id'. If the identity column is automatically managed by replication, update the range as follows: for the Publisher, execute sp_adjustpublisheridentityrange; for the Subscriber, run the Distribution Agent or the Merge Agent.

Is there a way to force specific identity value into a merge replicated table that is using identity range management?


Source: (StackOverflow)

Selective replication with CouchDB

I'm currently evaluating possible solutions to the following problem:

A set of data entries must be synchronized between multiple clients, where each client may only view (or even know about the existence of) a subset of the data. Each client "owns" some of the elements, and the decision about who else can read or modify those elements may only be made by the owner. To complicate this situation even more, each element (and each element revision) must have a unique identifier that is the same for all clients.

While the latter sounds like a perfect task for CouchDB (and a document-based data model would fit my needs perfectly), I'm not sure if the authentication/authorization subsystem of CouchDB can handle these requirements: while it should be possible to restrict write access using validation functions, there doesn't seem to be a way to authorize read access. All solutions I've found for this problem propose routing all CouchDB requests through a proxy (or an application layer) that handles authorization.

So, the question is: is it possible to implement an authorization layer that filters requests to the database so that access is granted only to documents that the requesting client has read access to, and still use the replication mechanism of CouchDB? Simplified, this would be some kind of "selective replication" where only some of the documents, and not the whole database, are replicated.

I would also be thankful for pointers to detailed information about how replication works. The CouchDB wiki and even the "Definitive Guide" book are not too specific about that.


Source: (StackOverflow)

Updating AUTO_INCREMENT value of all tables in a MySQL database

It is possible to set/reset the AUTO_INCREMENT value of a MySQL table via

ALTER TABLE some_table AUTO_INCREMENT = 1000

However, I need to set AUTO_INCREMENT based on its existing value (to fix M-M replication), with something like:

ALTER TABLE some_table SET AUTO_INCREMENT = AUTO_INCREMENT + 1

which does not work.

Ideally, I would like to run this query for all tables within a database, although that part is not crucial.

I could not find a way to deal with this problem other than running the queries manually. Could you please suggest something or point me to some ideas?

Thanks
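One way to avoid typing the statements by hand, as a rough sketch: read each table's current counter from `information_schema.TABLES` (its `AUTO_INCREMENT` column is NULL for tables without such a column) and generate one `ALTER TABLE` per table. The function name and the sample rows are hypothetical; fetching the rows and executing the generated statements with a MySQL client is left out.

```python
# Query whose result rows feed the generator below; the schema name is
# supplied as a parameter by whatever MySQL client is used.
CURRENT_VALUES_QUERY = (
    "SELECT TABLE_NAME, AUTO_INCREMENT "
    "FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = %s"
)


def bump_auto_increment_statements(rows, step=1):
    """Turn (table_name, current_auto_increment) rows into ALTER statements.

    Rows whose counter is None (no AUTO_INCREMENT column) are skipped.
    """
    statements = []
    for table_name, current in rows:
        if current is None:
            continue
        new_value = int(current) + step
        statements.append(
            f"ALTER TABLE `{table_name}` AUTO_INCREMENT = {new_value};"
        )
    return statements


# Made-up rows standing in for the query result.
for stmt in bump_auto_increment_statements([("users", 1000), ("log", None)]):
    print(stmt)
```

The generated statements can be reviewed before being run against the database.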


Source: (StackOverflow)

Read/Write splitting Hibernate

I have a fairly heavy Java webapp that serves thousands of requests/sec. It uses a master PostgreSQL db which replicates itself to one secondary (read-only) database using streaming (asynchronous) replication.

So I route requests to either the primary or the secondary (read-only) database based on URL, to keep read-only calls from loading the primary database; replication lag is minimal.

NOTE: I use one sessionFactory with a RoutingDataSource provided by Spring that looks up the db to use based on a key. I am interested in multitenancy, as I am using Hibernate 4.3.4, which supports it.

I have two questions:

  1. I don't think splitting on the basis of URLs is efficient, as I can only move about 10% of the traffic, because there are not many read-only URLs. What approach should I consider?
  2. Maybe I can somehow achieve some level of distribution between both nodes on the basis of URLs, but what would I do with my Quartz jobs (which even run in a separate JVM)? What pragmatic approach should I take?
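To make the alternative to URL-based splitting concrete, here is a language-neutral sketch (written in Python for brevity, not in the Java of the actual stack): the routing key is set by the unit of work itself, which declares that it is read-only, and the data-source lookup simply reads that key. All names below are hypothetical; in a Spring setup the equivalent lookup would live in the routing data source's key-resolution method, and a scheduled job could mark itself read-only in the same way as a web request.

```python
import contextvars
from contextlib import contextmanager

# Routing key for the current unit of work; defaults to the primary.
_db_route = contextvars.ContextVar("db_route", default="primary")


@contextmanager
def read_only():
    """Mark the enclosed unit of work as safe to run on the replica."""
    token = _db_route.set("replica")
    try:
        yield
    finally:
        # Restore the previous key even if the work raised.
        _db_route.reset(token)


def current_route():
    """What a routing data source would consult to pick a connection."""
    return _db_route.get()


def load_report():
    # A read-only service method or scheduled job opts in explicitly.
    with read_only():
        return current_route()
```

Because the decision is attached to the work rather than to the URL, any code path that only reads can be moved to the secondary.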

I know I might not get a perfect answer here as this really is broad but I just want your opinion for the context.

Dudes I have in my team:

  • Spring4
  • Hibernate4
  • Quartz2.2
  • Java7 / Tomcat7

Please take interest. Thanks in advance.


Source: (StackOverflow)