307403 |
16-Oct-2016 |
sevan |
MFC r267667: use .Mt to mark up email addresses consistently (part1)
PR: 191174 Submitted by: Franco Fichtner <franco at lastsummer.de> |
305142 |
31-Aug-2016 |
dim |
MFC r304969:
Define hastd's STRICT_ALIGN macro in a defined and portable way. |
280250 |
19-Mar-2015 |
rwatson |
Merge an applicable subset of r263234 from HEAD to stable/10:
Update most userspace consumers of capability.h to use capsicum.h instead.
auditdistd is not updated as I will make the change upstream and then do a vendor import sometime in the next week or two.
Note that a significant fraction does not apply, as FreeBSD 10 doesn't contain a Capsicumised ping, casperd, libcasper, etc. When these features are merged, the capsicum.h change will need to be merged with them.
Sponsored by: Google, Inc. |
270911 |
01-Sep-2014 |
ngie |
MFC r270433:
Garbage collect libl dependency
The application links and runs without libl
Approved by: rpaulo (mentor) Phabric: D673 Submitted by: trociny |
270910 |
01-Sep-2014 |
ngie |
MFC r270117:
Add -ll to LDADD to fix "make checkdpadd"
Phabric: D622 Approved by: rpaulo (mentor) |
262192 |
18-Feb-2014 |
jhb |
MFC 261517,261520: Convert the license on files where I am the sole copyright holder to 2 clause BSD licenses. |
260006 |
28-Dec-2013 |
trociny |
MFC r257155, r257582, r259191, r259192, r259193, r259194, r259195, r259196:
r257155:
Make hastctl list command output current queue sizes.
Reviewed by: pjd
r257582 (pjd):
Correct alignment.
r259191:
For memsync replication, hio_countdown is used not only as an indication when a request can be moved to done queue, but also for detecting the current state of memsync request.
This approach has problems, e.g. leaking a request if memsync ack from the secondary failed, or racy usage of write_complete, which should be called only once per write request, but for memsync can be entered by local_send_thread and ggate_send_thread simultaneously.
So the following approach is implemented instead:
1) Use hio_countdown only for counting components we are waiting to complete, i.e. initially it is always 2 for any replication mode.
2) To distinguish between "memsync ack" and "memsync fin" responses from the secondary, add and use hio_memsyncacked field.
3) write_complete() in component threads is called only before releasing hio_countdown (i.e. before the hio may be returned to the done queue).
4) Add and use hio_writecount refcounter to detect when write_complete() can be called in memsync case.
Reported by: Pete French petefrench ingresso.co.uk Tested by: Pete French petefrench ingresso.co.uk
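A minimal sketch of the two-counter scheme described in r259191, assuming C11 atomics. The struct sk_hio type and the sk_ functions are hypothetical stand-ins and are not the hastd code; the sketch only shows why the completion step runs once and why only the last component hands the request over.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a HAST I/O request; not the real struct hio. */
struct sk_hio {
	atomic_int countdown;	/* components still working on the request */
	atomic_int writecount;	/* components that have not finished the write */
	int completed;		/* how many times the write was completed */
	int done;		/* how many times the request reached the done queue */
};

static void
sk_hio_init(struct sk_hio *hio, int ncomponents)
{
	atomic_init(&hio->countdown, ncomponents);
	atomic_init(&hio->writecount, ncomponents);
	hio->completed = 0;
	hio->done = 0;
}

/* Returns true only for the caller that dropped the counter to zero. */
static bool
sk_release(atomic_int *cnt)
{
	int old = atomic_fetch_sub(cnt, 1);

	assert(old > 0);
	return (old == 1);
}

/* Called by each component thread when its part of the write is finished. */
static void
sk_component_done(struct sk_hio *hio)
{
	/* Only the last writer completes the write, so this runs exactly once. */
	if (sk_release(&hio->writecount))
		hio->completed++;
	/* Only the last component moves the request to the done queue. */
	if (sk_release(&hio->countdown))
		hio->done++;
}
```

With two components, the first call to sk_component_done() changes nothing visible and the second one both completes the write and queues the request.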
r259192:
Add some macros to make the code more readable (no functional changes).
r259193:
Fix compiler warnings.
r259194:
In remote_send_thread, if sending a request fails don't take the request back from the receive queue -- it might already be processed by remote_recv_thread, which led to crashes like below:
(primary) Unable to receive reply header: Connection reset by peer.
(primary) Unable to send request (Connection reset by peer): WRITE(954662912, 131072).
(primary) Disconnected from kopusha:7772.
(primary) Increasing localcnt to 1.
(primary) Assertion failed: (old > 0), function refcnt_release, file refcnt.h, line 62.
Taking the request back was not necessary (it would properly be processed by the remote_recv_thread) and only complicated things.
r259195:
Send wakeup to threads waiting on empty queue before releasing the lock to decrease spurious wakeups.
Submitted by: davidxu
r259196:
Check remote protocol version only for the first connection (when it is actually sent by the remote node).
Otherwise it generated confusing "Negotiated protocol version 1" debug messages when processing the second connection. |
259073 |
07-Dec-2013 |
peter |
Hoist all the mergeinfo up to the root in preparation for enforcing merges to the root only. All MFC's were rerecorded to the root.
Going forward, if an MFC includes mergeinfo, it will need to be made to the root and committed from the root. Merges with --ignore-ancestry or diff | patch can go anywhere.
The mergeinfo in HEAD is in a bad state from years of neglect and manual tampering and this was branched into 10.x. This confuses the coalescing code and prevents it from doing its job.
Approved by: re (gjb, implicit) |
257468 |
31-Oct-2013 |
trociny |
MFC r257154:
Merging local and remote bitmaps must be protected by hr_amp lock.
This is believed to fix hastd crashes, which might occur during synchronization, triggered by the failed assertion:
Assertion failed: (amp->am_memtab[ext] > 0), function activemap_write_complete, file activemap.c, line 351.
Approved by: re (glebius) |
256281 |
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
255717 |
19-Sep-2013 |
trociny |
Fix comments.
Approved by: re (marius) MFC after: 3 days
|
255716 |
19-Sep-2013 |
trociny |
When updating the map of dirty extents, most recently used extents are kept dirty to reduce the number of on-disk metadata updates. The sequence of operations is:
1) acquire the activemap lock; 2) update in-memory map; 3) if the list of keepdirty extents is changed, update on-disk metadata; 4) release the lock.
On-disk updates are infrequent in comparison with in-memory updates, but they take much more time. So situations are possible when one thread is updating on-disk metadata and another one is waiting for the activemap lock just to update the in-memory map.
Improve this by introducing an additional on-disk map lock: when the in-memory map is updated and it is detected that the on-disk map needs an update too, the on-disk map lock is acquired and the in-memory lock is released before flushing the map.
Reported by: Yamagi Burmeister yamagi.org Tested by: Yamagi Burmeister yamagi.org Reviewed by: pjd Approved by: re (marius) MFC after: 2 weeks
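The lock handoff described in r255716 can be sketched as follows, assuming POSIX mutexes. struct sk_map, its counters and the needs_flush flag are hypothetical and only illustrate the ordering: the on-disk lock is taken before the in-memory lock is dropped, so flushes stay serialized while other threads are free to update the in-memory map.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the activemap and its two locks. */
struct sk_map {
	pthread_mutex_t mem_lock;	/* protects the in-memory map */
	pthread_mutex_t disk_lock;	/* serializes on-disk metadata writes */
	int mem_updates;		/* guarded by mem_lock */
	int disk_flushes;		/* guarded by disk_lock */
};

static void
sk_map_init(struct sk_map *m)
{
	assert(m != NULL);
	pthread_mutex_init(&m->mem_lock, NULL);
	pthread_mutex_init(&m->disk_lock, NULL);
	m->mem_updates = 0;
	m->disk_flushes = 0;
}

/* needs_flush stands for "the list of keepdirty extents has changed". */
static void
sk_map_update(struct sk_map *m, bool needs_flush)
{
	pthread_mutex_lock(&m->mem_lock);
	m->mem_updates++;
	if (!needs_flush) {
		pthread_mutex_unlock(&m->mem_lock);
		return;
	}
	/* Take the on-disk lock first, then let others update memory. */
	pthread_mutex_lock(&m->disk_lock);
	pthread_mutex_unlock(&m->mem_lock);
	m->disk_flushes++;	/* the slow metadata write would happen here */
	pthread_mutex_unlock(&m->disk_lock);
}
```

The lock order is always in-memory first and on-disk second, and the in-memory lock is never requested while the on-disk lock is held, so the two locks cannot deadlock against each other in this sketch.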
|
255714 |
19-Sep-2013 |
trociny |
Use cv_broadcast() instead of cv_signal() when waking up threads waiting on an empty queue as the queue may have several consumers.
Before the fix the following scenario was possible: 2 threads are waiting on empty queue, 2 threads are inserting simultaneously. The first inserting thread detects that the queue is empty and is going to send the signal, but before it sends the second thread inserts too. When the first sends the signal only one of the waiting threads receives it while the other one may wait forever.
The scenario above is believed to be the cause of the observed cases, when ggate_recv_thread() was getting stuck on taking free request, while the free queue was not empty.
Reviewed by: pjd Tested by: Yamagi Burmeister yamagi.org Approved by: re (marius) MFC after: 2 weeks
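A minimal sketch of the wakeup pattern from r255714, using POSIX condition variables in place of hastd's own cv_* wrappers. struct sk_queue and the sk_queue_* functions are hypothetical. The inserter wakes waiters only when the queue was empty, which is why it has to wake all of them.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SK_QSIZE 8

/* Hypothetical pool of requests; the order of items is not significant. */
struct sk_queue {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int items[SK_QSIZE];
	size_t count;
};

static void
sk_queue_init(struct sk_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->cond, NULL);
	q->count = 0;
}

static void
sk_queue_insert(struct sk_queue *q, int item)
{
	int wasempty;

	pthread_mutex_lock(&q->lock);
	assert(q->count < SK_QSIZE);
	wasempty = (q->count == 0);
	q->items[q->count++] = item;
	/*
	 * Wake every waiter: a second insert may follow before any waiter
	 * runs, and it will not signal because the queue is no longer empty.
	 */
	if (wasempty)
		pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

static int
sk_queue_take(struct sk_queue *q)
{
	int item;

	pthread_mutex_lock(&q->lock);
	/* Re-check the condition after every wakeup. */
	while (q->count == 0)
		pthread_cond_wait(&q->cond, &q->lock);
	item = q->items[--q->count];
	pthread_mutex_unlock(&q->lock);
	return (item);
}
```

Consumers would call sk_queue_take() from their own threads. Because each of them re-checks the count under the lock, a broadcast that wakes more threads than there are items is harmless.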
|
255219 |
05-Sep-2013 |
pjd |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
	uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain the total number of elements in the array minus 2. This means that if those two bits are equal to 0, we have 2 array elements.
The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.
To define a new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, e.g.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine a few rights, but the rights have to belong to the same array element, e.g.:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is a new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
	__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
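The terminator trick above can be tried in isolation. This is a sketch only: sk_rights_or() and SK_RIGHTS_OR() are hypothetical names, and the function simply ORs its arguments rather than filling a cap_rights_t.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: ORs every argument up to the 0ULL terminator. */
static unsigned long long
sk_rights_or(unsigned long long first, ...)
{
	va_list ap;
	unsigned long long acc = 0, right;

	va_start(ap, first);
	for (right = first; right != 0;
	    right = va_arg(ap, unsigned long long))
		acc |= right;
	va_end(ap);
	return (acc);
}

/* The macro appends the terminator so callers never have to. */
#define SK_RIGHTS_OR(...)	sk_rights_or(__VA_ARGS__, 0ULL)
```

Every argument has to be an unsigned long long value (for example a constant with the ULL suffix), because va_arg() reads each one back with that type.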
Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belong to the same array element this way is correct, but is not advised. It should only be used for alias definitions.
This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
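A sketch of how the one-bit array index makes mixed-element rights detectable. The bit positions are an assumption derived from the description above (two version bits at the top, five index bits directly below them), and all SK_ names are hypothetical rather than the definitions from the system headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bits 62-63 hold the version, bits 57-61 the index. */
#define SK_IDX_SHIFT	57
#define SK_CAPRIGHT(idx, bit)	((1ULL << (SK_IDX_SHIFT + (idx))) | (bit))
#define SK_IDXBITS(right)	(((right) >> SK_IDX_SHIFT) & 0x1fULL)

#define SK_CAP_LOOKUP	SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD	SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL	SK_CAPRIGHT(1, 0x0000000000000800ULL)

/* A right or alias is well formed only if exactly one index bit is set. */
static bool
sk_right_is_valid(uint64_t right)
{
	uint64_t idx = SK_IDXBITS(right);

	return (idx != 0 && (idx & (idx - 1)) == 0);
}
```

ORing SK_CAP_LOOKUP with SK_CAP_FCHMOD keeps a single index bit set, while ORing SK_CAP_LOOKUP with SK_CAP_PDKILL sets two and is rejected. Under this assumed layout each element leaves 57 bits for rights, which agrees with the figures of 114 rights for two elements and 285 for five.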
|
252472 |
01-Jul-2013 |
trociny |
Make hastctl(1) ('list' command) output a worker pid.
Reviewed by: pjd MFC after: 3 days
|
252421 |
30-Jun-2013 |
schweikh |
Correct some grammar.
|
252386 |
29-Jun-2013 |
ed |
Don't let hastd use C11 atomics.
Due to possible concerns about the stability of C11 atomics, use our existing atomics API instead.
Requested by: pjd
|
251796 |
15-Jun-2013 |
ed |
Let hastd use C11 atomics.
C11 atomics now work on all the architectures. Have at least a single piece of software in our base system that uses C11 atomics. This somewhat makes it less likely that we break it because of LLVM imports, etc.
|
250914 |
22-May-2013 |
jkim |
Improve compatibility with old flex and fix build with GCC.
|
250503 |
11-May-2013 |
trociny |
Get rid of libl dependency. We needed it only to provide yywrap. But yywrap is not necessary when parsing a single hast.conf file.
Suggested by: kib Reviewed by: pjd
|
249970 |
27-Apr-2013 |
ed |
Partially revert my last change.
I forgot that I still had a locally applied patch to my copy of Clang that needs to be pushed in before we should use C11 atomics.
|
249969 |
27-Apr-2013 |
ed |
Use C11 <stdatomic.h> instead of our non-standard <machine/atomic.h>.
Reviewed by: pjd
|
249657 |
19-Apr-2013 |
ed |
Add the Clang specific -Wmissing-variable-declarations to WARNS=6.
This compiler flag enforces that people either mark variables static or use an external declaration for the variable, similar to how -Wmissing-prototypes works for functions.
Due to the fact that Yacc/Lex generate code that cannot trivially be changed to not warn because of this (lots of yy* variables), add a NO_WMISSING_VARIABLE_DECLARATIONS that can be used to turn off this specific compiler warning.
Announced on: toolchain@
|
248297 |
14-Mar-2013 |
pjd |
Now that ioctl(2) is allowed in capability mode and we can limit ioctls for the given descriptors, use Capsicum sandboxing for hastd in primary and secondary modes. Allow for DIOCGDELETE and DIOCGFLUSH ioctls on provider descriptor and for G_GATE_CMD_MODIFY, G_GATE_CMD_START, G_GATE_CMD_DONE and G_GATE_CMD_DESTROY on GEOM Gate descriptor.
Sponsored by: The FreeBSD Foundation
|
248296 |
14-Mar-2013 |
pjd |
Minor corrections.
|
248294 |
14-Mar-2013 |
pjd |
Delete requests can be larger than MAXPHYS.
|
247281 |
25-Feb-2013 |
trociny |
Add i/o error counters to hastd(8) and make hastctl(8) display them. This may be useful for detecting problems with HAST disks.
Discussed with and reviewed by: pjd MFC after: 1 week
|
246922 |
17-Feb-2013 |
pjd |
- Add support for 'memsync' mode. This is the fastest replication mode, which is why it will now be the default.
- Bump protocol version to 2 and add backward compatibility for version 1.
- Allow hosts to be specified by kern.hostid as well (in addition to hostname and kern.hostuuid) in the configuration file.
Sponsored by: Panzura Tested by: trociny
|
244538 |
21-Dec-2012 |
kevlo |
Fix socket calls on error post-r243965.
Submitted by: Garrett Cooper
|
242593 |
05-Nov-2012 |
pjd |
Revert r228695. We use __func__ here as a format to distinguish between abort and assert. It would be cleaner to use NULL or "" here, but gcc complains in both cases.
|
238538 |
16-Jul-2012 |
trociny |
Metaflush on/off values don't need quotes.
Reviewed by: pjd MFC after: 3 days
|
238120 |
04-Jul-2012 |
pjd |
Make use of GEOM Gate direct reads feature. This allows HAST to serve reads with native speed of the underlying provider. There are three situations when direct reads are not used:
1. Data is being synchronized and synchronization source is the secondary node, which means secondary node has more recent data and we should read from it.
2. Local read failed and we have to try to read from the secondary node.
3. Local component is unavailable and all I/O requests are served from the secondary node.
Sponsored by: Panzura, http://www.panzura.com MFC after: 1 month
|
237931 |
01-Jul-2012 |
pjd |
Check if there is cmsg at all.
MFC after: 3 days
|
236919 |
11-Jun-2012 |
hselasky |
Revert: r236909
Pointyhat: me
|
236909 |
11-Jun-2012 |
hselasky |
Use the correct clock source when computing timeouts.
MFC after: 1 week
|
236507 |
03-Jun-2012 |
pjd |
Simplify the code by using snprlcat().
MFC after: 3 days
|
235873 |
24-May-2012 |
wblock |
Fixes to man8 groff mandoc style, usage mistakes, or typos.
PR: 168016 Submitted by: Nobuyuki Koganemaru Approved by: gjb MFC after: 3 days
|
235789 |
22-May-2012 |
bapt |
Fix world after byacc import:
- old yacc(1) used to magically append stdlib.h, while the new one doesn't
- new yacc(1) declares yyparse by itself, so fix the redundant declaration of 'yyparse'
Approved by: des (mentor)
|
235337 |
12-May-2012 |
gjb |
General mdoc(7) and typo fixes.
PR: 167804 Submitted by: Nobuyuki Koganemaru (kogane!jp.freebsd.org) MFC after: 3 days
|
233679 |
29-Mar-2012 |
trociny |
If hastd is invoked with "-P pidfile" option always create pidfile regardless of whether -F (foreground) option is set or not.
Also, if -P option is specified, ignore pidfile setting from configuration not only on start but on reload too. This fixes the issue where a reload changed the pidfile of a hastd that was run with the -P option.
Reviewed by: pjd MFC after: 1 week
|
233392 |
23-Mar-2012 |
trociny |
Fix typo.
MFC after: 3 days
|
231525 |
11-Feb-2012 |
pjd |
Nice range comparison.
MFC after: 3 days
|
231016 |
05-Feb-2012 |
trociny |
If a local write request is from the synchronization thread, when it is synchronizing data that is out of date on the local component, we should not send G_GATE_CMD_DONE acknowledge to the kernel.
This fixes the issue, observed in async mode, when on synchronization from the remote component the worker terminated with "G_GATE_CMD_DONE failed" error.
Reported by: Artem Kajalainen <artem kayalaynen ru> Reviewed by: pjd MFC after: 1 week
|
231015 |
05-Feb-2012 |
trociny |
Fix the regression introduced in r226859: if the local component is out of date BIO_READ requests got lost instead of being sent to the remote component.
Reviewed by: pjd MFC after: 1 week
|
230976 |
04-Feb-2012 |
pjd |
Fix typo in comment.
MFC after: 3 days
|
230515 |
24-Jan-2012 |
pjd |
- Fix documentation to note that /etc/hast.conf is the default configuration file for hastd(8) and hastctl(8) and not hast.conf.
- In copyright statement correct that this file is documentation, not software.
- Bump date.
MFC after: 3 days
|
230457 |
22-Jan-2012 |
pjd |
Free memory that won't be used in child.
MFC after: 1 week
|
230436 |
21-Jan-2012 |
pjd |
Fix minor memory leak.
MFC after: 3 days
|
230396 |
20-Jan-2012 |
pjd |
Remove another unused token.
MFC after: 3 days
|
230395 |
20-Jan-2012 |
pjd |
Remove unused token 'port'.
MFC after: 3 days
|
230092 |
13-Jan-2012 |
pjd |
Style cleanups.
MFC after: 3 days
|
229946 |
10-Jan-2012 |
pjd |
- Fix a bug where pidfile was removed in SIGHUP when it hasn't changed in configuration file.
- Log the fact that pidfile has changed.
MFC after: 3 days
|
229945 |
10-Jan-2012 |
pjd |
For functions that return -1 on failure check exactly for -1 and not for any negative number.
MFC after: 3 days
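A small sketch of the convention in r229945, using open(2) as the example call. sk_try_open() is a hypothetical helper and is not taken from hastd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 0 if the path could be opened, otherwise the errno value. */
static int
sk_try_open(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)	/* not "fd < 0": -1 is the documented failure value */
		return (errno);
	close(fd);
	return (0);
}
```

Comparing against -1 mirrors what the manual page documents, so the check reads as a statement about the interface rather than a guess about its range of return values.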
|
229944 |
10-Jan-2012 |
pjd |
Don't touch pidfiles when running in foreground. Before that change we would create an empty pidfile on start and check if it changed on SIGHUP.
MFC after: 3 days
|
229778 |
07-Jan-2012 |
uqs |
Spelling fixes for sbin/
|
229744 |
06-Jan-2012 |
pjd |
fork(2) returns -1 on failure, not some random negative number.
MFC after: 3 days
|
229699 |
06-Jan-2012 |
pjd |
Constify argument.
MFC after: 3 days
|
228712 |
19-Dec-2011 |
dim |
Use NO_WCAST_ALIGN for usr.bin/hastctl and usr.bin/hastd; the alignment warnings in sbin/hastd/lzf.c are only emitted for i386 and amd64, and there they can be safely ignored.
MFC after: 1 week
|
228696 |
18-Dec-2011 |
pjd |
Use lex's standard way of not generating unused function.
Inspired by: r228555 MFC after: 1 week
|
228695 |
18-Dec-2011 |
pjd |
Don't use function name as format string.
Detected by: clang MFC after: 1 week
|
228544 |
15-Dec-2011 |
pjd |
Remove redundant assignment.
Found by: Clang Static Analyzer MFC after: 1 week
|
228543 |
15-Dec-2011 |
pjd |
Simplify code by changing function types from int to void, as the functions always return 0.
Found by: Clang Static Analyzer MFC after: 1 week
|
228542 |
15-Dec-2011 |
pjd |
Remove redundant setting of the error variable.
Found by: Clang Static Analyzer MFC after: 1 week
|
226861 |
27-Oct-2011 |
pjd |
Remove redundant space.
MFC after: 3 days
|
226859 |
27-Oct-2011 |
pjd |
Implement 'async' mode for HAST.
MFC after: 3 days
|
226857 |
27-Oct-2011 |
pjd |
Minor cleanups.
MFC after: 3 days
|
226856 |
27-Oct-2011 |
pjd |
Reduce indentation.
MFC after: 3 days
|
226855 |
27-Oct-2011 |
pjd |
Improve comment so it doesn't suggest race is possible, but that we handle the race.
MFC after: 3 days
|
226854 |
27-Oct-2011 |
pjd |
- Eliminate the need for hio_nv.
- Introduce hio_clear() function for clearing hio before returning it onto free queue.
MFC after: 3 days
|
226852 |
27-Oct-2011 |
pjd |
Minor cleanups.
MFC after: 3 days
|
226851 |
27-Oct-2011 |
pjd |
Delay resuid generation until first connection to secondary, not until first write. This way on first connection we will synchronize only the extents that were modified during the lifetime of primary node, not entire GEOM provider.
MFC after: 3 days
|
226842 |
27-Oct-2011 |
pjd |
Correct comments.
MFC after: 3 days
|
226463 |
17-Oct-2011 |
pjd |
Allow to specify pidfile in HAST configuration file.
MFC after: 1 week
|
226462 |
17-Oct-2011 |
pjd |
Remove redundant space.
MFC after: 1 week
|
226461 |
17-Oct-2011 |
pjd |
When path to the configuration file is relative, obtain full path, so we can always find the file, even after daemonizing and changing working directory to /.
MFC after: 1 week
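One way to obtain the full path before daemonizing is realpath(3). This is a sketch under that assumption and does not claim to be how hastd resolves the path; sk_absolute_path() is a hypothetical helper.

```c
#define _XOPEN_SOURCE 700	/* for realpath(3) on strict compilers */

#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Resolves path into buf, which must hold PATH_MAX bytes.
 * Returns 0 on success and -1 on failure (errno is set by realpath).
 */
static int
sk_absolute_path(const char *path, char *buf)
{
	assert(path != NULL && buf != NULL);
	if (realpath(path, buf) == NULL)
		return (-1);
	return (0);
}
```

The resolution has to happen while the original working directory is still current; once the process has changed directory to /, a relative name can no longer be resolved to the intended file.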
|
225835 |
28-Sep-2011 |
pjd |
Correct typo.
MFC after: 3 days
|
225832 |
28-Sep-2011 |
pjd |
If the underlying provider doesn't support BIO_FLUSH, log it only once and don't bother trying in the future.
MFC after: 3 days
|
225831 |
28-Sep-2011 |
pjd |
Break a bit earlier.
MFC after: 3 days
|
225830 |
28-Sep-2011 |
pjd |
After every activemap change flush disk's write cache, so that write reordering cannot cause the actual write to be committed before the corresponding extent is marked as dirty.
It can be disabled in configuration file.
If BIO_FLUSH is not supported by the underlying file system we log a warning and never send BIO_FLUSH again to that GEOM provider.
MFC after: 3 days
|
225787 |
27-Sep-2011 |
pjd |
Use PJDLOG_ASSERT() and PJDLOG_ABORT() everywhere instead of assert().
MFC after: 3 days
|
225786 |
27-Sep-2011 |
pjd |
No need to wrap pjdlog functions around with KEEP_ERRNO() macro.
MFC after: 3 days
|
225784 |
27-Sep-2011 |
pjd |
- Convert some impossible conditions into assertions.
- Add missing 'if' in comment.
MFC after: 3 days
|
225783 |
27-Sep-2011 |
pjd |
Correct two mistakes when converting asserts to PJDLOG_ASSERT()/PJDLOG_ABORT().
MFC after: 3 days
|
225782 |
27-Sep-2011 |
pjd |
Prefer PJDLOG_ASSERT() and PJDLOG_ABORT() over assert() and abort(). pjdlog versions will log problem to syslog when application is running in background.
MFC after: 3 days
|
225781 |
27-Sep-2011 |
pjd |
No need to use KEEP_ERRNO() macro around pjdlog functions, as they don't modify errno.
MFC after: 3 days
|
225773 |
27-Sep-2011 |
pjd |
Ensure that pjdlog functions don't modify errno.
MFC after: 3 days
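The guarantee in r225773 is what makes the KEEP_ERRNO() wrappers unnecessary. A minimal sketch of a logging function with that property follows; sk_log() is hypothetical and writes to stderr instead of using the pjdlog machinery.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Logs a formatted line to stderr and leaves errno untouched. */
static void
sk_log(const char *fmt, ...)
{
	int saved_errno = errno;	/* remember the caller's errno */
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);	/* may change errno on failure */
	va_end(ap);
	fputc('\n', stderr);
	errno = saved_errno;		/* restore it before returning */
}
```

A caller can then log first and inspect errno afterwards without saving it around every log call.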
|
223974 |
13-Jul-2011 |
trociny |
Fix indentation.
Approved by: pjd (mentor)
|
223780 |
05-Jul-2011 |
trociny |
Remove useless initialization.
Approved by: pjd (mentor) MFC after: 3 days
|
223655 |
28-Jun-2011 |
trociny |
Check the returned value of activemap_write_complete() and update metadata on disk if needed. This should fix a potential case when extents are cleared in activemap but metadata is not updated on disk.
Suggested by: pjd Approved by: pjd (mentor)
|
223654 |
28-Jun-2011 |
trociny |
Make activemap_write_start/complete check the keepdirty list when deciding whether we need to update activemap on disk. This makes keepdirty serve its purpose -- to reduce number of metadata updates.
Discussed with: pjd Approved by: pjd (mentor)
|
223586 |
27-Jun-2011 |
pjd |
Compile hastd and hastctl with capsicum support.
X-MFC after: capsicum merge
|
223585 |
27-Jun-2011 |
pjd |
Compile capsicum support only if HAVE_CAPSICUM is defined.
MFC after: 3 days
|
223584 |
27-Jun-2011 |
pjd |
Log a warning if we cannot sandbox using capsicum, but only under debug level 1. It would be too noisy to log it as a proper warning as CAPABILITIES are not compiled into GENERIC by default.
MFC after: 3 days
|
223181 |
17-Jun-2011 |
trociny |
In HAST we use two sockets - one for only sending the data and one for only receiving the data. In r220271 the unused directions were disabled using shutdown(2).
Unfortunately, this broke automatic receive buffer sizing, which currently works only for connections in ESTABLISHED state. It was a root cause of the issue reported by users, when connection between primary and secondary could get stuck.
Disable the code introduced in r220271 until the issue with automatic buffer sizing is resolved.
Reported by: Daniel Kalchev <daniel@digsys.bg>, danger, sobomax Tested by: Daniel Kalchev <daniel@digsys.bg>, danger Approved by: pjd (mentor) MFC after: 1 week
|
223143 |
16-Jun-2011 |
sobomax |
Revert r222688.
Requested by: Mikolaj Golub
|
222688 |
04-Jun-2011 |
sobomax |
Read from the socket using the same max buffer size as we use while sending. What happens otherwise is that the sender splits all the traffic into 32k chunks, while the receiver is waiting for the whole packet. Then for certain packet sizes, particularly 66607 bytes in my case, the communication gets stuck: the secondary is expecting to read one chunk of 66607 bytes, while the primary is sending two chunks of 32768 bytes and a third chunk of 1071. Probably due to TCP windowing and buffering the final chunk gets stuck somewhere, so neither server nor client can make any progress.
This patch also protects from short reads, as according to the manual page there are some cases when MSG_WAITALL can give less data than expected.
MFC after: 3 days
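A sketch of a receive loop in the spirit of r222688: it reads in chunks no larger than an assumed 32k limit and keeps going after short reads. sk_recv_all() and SK_MAX_CHUNK are hypothetical, and the ENOTCONN value reported on end of stream is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define SK_MAX_CHUNK 32768	/* assumed to match the sender's chunk size */

/* Receives exactly size bytes. Returns 0 on success, -1 on failure. */
static int
sk_recv_all(int fd, void *buf, size_t size)
{
	unsigned char *p = buf;
	size_t chunk;
	ssize_t n;

	assert(buf != NULL);
	while (size > 0) {
		chunk = size < SK_MAX_CHUNK ? size : SK_MAX_CHUNK;
		n = recv(fd, p, chunk, MSG_WAITALL);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0) {
			/* Peer closed the connection before sending it all. */
			errno = ENOTCONN;
			return (-1);
		}
		/* MSG_WAITALL may still return less than requested. */
		p += n;
		size -= (size_t)n;
	}
	return (0);
}
```

For the 66607-byte packet mentioned above the loop would ask for 32768, 32768 and then 1071 bytes, matching the three chunks the sender produces.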
|
222467 |
29-May-2011 |
trociny |
If READ from the local node failed we send the request to the remote node. There is no use in doing this for synchronization requests.
Approved by: pjd (mentor) MFC after: 1 week
|
222228 |
23-May-2011 |
pjd |
Keep statistics on number of BIO_READ, BIO_WRITE, BIO_DELETE and BIO_FLUSH requests as well as number of activemap updates.
Number of BIO_WRITEs and activemap updates are especially interesting, because if those two are too close to each other, it means that your workload needs bigger number of dirty extents. Activemap should be updated as rarely as possible.
MFC after: 1 week
|
222224 |
23-May-2011 |
pjd |
To handle BIO_FLUSH and BIO_DELETE requests in secondary worker we need to use ioctl(2). This is why we can't use capsicum for now to sandbox secondary. Capsicum is still used to sandbox hastctl.
MFC after: 1 week
|
222164 |
21-May-2011 |
pjd |
Recognize HIO_FLUSH requests.
MFC after: 1 week
|
222121 |
20-May-2011 |
pjd |
Document IPv6 support.
MFC after: 3 weeks
|
222120 |
20-May-2011 |
pjd |
If no listen address is specified, bind by default to:
tcp4://0.0.0.0:8457 tcp6://[::]:8457
MFC after: 3 weeks
|
222119 |
20-May-2011 |
pjd |
Rename ipv4/ipv6 to tcp4/tcp6.
MFC after: 3 weeks
|
222118 |
20-May-2011 |
pjd |
Now that hell is fully frozen it is good time to add IPv6 support to HAST.
MFC after: 3 weeks
|
222117 |
20-May-2011 |
pjd |
Allow [ ] characters in strings. They might be used in IPv6 addresses.
MFC after: 3 weeks
|
222116 |
20-May-2011 |
pjd |
Rename tcp4 to tcp in preparation for IPv6 support.
MFC after: 3 weeks
|
222115 |
20-May-2011 |
pjd |
Rename proto_tcp4.c to proto_tcp.c in preparation for IPv6 support.
MFC after: 2 weeks
|
222108 |
19-May-2011 |
pjd |
In preparation for IPv6 support allow to specify multiple addresses to listen on.
MFC after: 3 weeks
|
222087 |
18-May-2011 |
pjd |
- Add support for AF_INET6 sockets for %S format character.
- Use inet_ntop(3) instead of reimplementing it.
- Use %hhu for unsigned char instead of casting it to unsigned int and using %u.
MFC after: 1 week
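A sketch of formatting a socket address for both families with inet_ntop(3). sk_format_addr() is a hypothetical helper, and the bracketed IPv6 output form is a choice made here; it is not taken from the %S implementation.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Writes "addr:port" or "[addr]:port" into buf. Returns 0 or -1. */
static int
sk_format_addr(const struct sockaddr *sa, char *buf, size_t size)
{
	char host[INET6_ADDRSTRLEN];

	assert(sa != NULL && buf != NULL);
	if (sa->sa_family == AF_INET) {
		const struct sockaddr_in *sin =
		    (const struct sockaddr_in *)sa;

		if (inet_ntop(AF_INET, &sin->sin_addr, host,
		    sizeof(host)) == NULL)
			return (-1);
		snprintf(buf, size, "%s:%u", host,
		    (unsigned int)ntohs(sin->sin_port));
	} else if (sa->sa_family == AF_INET6) {
		const struct sockaddr_in6 *sin6 =
		    (const struct sockaddr_in6 *)sa;

		if (inet_ntop(AF_INET6, &sin6->sin6_addr, host,
		    sizeof(host)) == NULL)
			return (-1);
		snprintf(buf, size, "[%s]:%u", host,
		    (unsigned int)ntohs(sin6->sin6_port));
	} else {
		return (-1);	/* unsupported address family */
	}
	return (0);
}
```

INET6_ADDRSTRLEN is large enough for either family, so one host buffer serves both branches.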
|
221899 |
14-May-2011 |
pjd |
Currently we are unable to use capsicum for the primary worker process, because we need to do ioctl(2)s, which are not permitted in the capability mode. What we do now is to chroot(2) to /var/empty, which restricts access to file system name space and we drop privileges to hast user and hast group.
This still allows access to other name spaces, like list of processes, network and sysvipc.
To address that, use jail(2) instead of chroot(2). Using jail(2) will restrict access to process table, network (we use ip-less jails) and sysvipc (if security.jail.sysvipc_allowed is turned off). This provides much better separation.
MFC after: 1 week
|
221898 |
14-May-2011 |
pjd |
When using capsicum to sandbox, still use other methods first, just in case one of them has some problems.
|
221643 |
08-May-2011 |
pjd |
Allow to specify remote as 'none' again which was broken by r219351, where 'none' was defined as a value for checksum.
Reported by: trasz MFC after: 1 week
|
221632 |
08-May-2011 |
trociny |
Fix isitme(), which is used to check if node-specific configuration belongs to our node, and was returning false positive if the first part of a node name matches short hostname.
Approved by: pjd (mentor)
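The kind of false positive described in r221632 typically comes from a prefix comparison. The sketch below shows one way to avoid it; sk_name_matches_host() is hypothetical and covers only hostname matching, not the other identifiers a node section may use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name equals the full hostname or exactly its first label. */
static bool
sk_name_matches_host(const char *name, const char *hostname)
{
	const char *dot;
	size_t shortlen;

	assert(name != NULL && hostname != NULL);
	if (strcmp(name, hostname) == 0)
		return (true);
	dot = strchr(hostname, '.');
	if (dot == NULL)
		return (false);
	shortlen = (size_t)(dot - hostname);
	/* Compare lengths too: strncmp() alone accepts "foobar" for "foo". */
	return (strlen(name) == shortlen &&
	    strncmp(name, hostname, shortlen) == 0);
}
```

With hostname foo.example.org, the names foo and foo.example.org match, while foobar does not.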
|
221078 |
26-Apr-2011 |
trociny |
Add missing ifdef. This fixes build with NO_OPENSSL.
Reported by: Pawel Tyll <ptyll@nitronet.pl> Approved by: pjd (mentor) MFC after: 1 week
|
221076 |
26-Apr-2011 |
trociny |
Rename HASTCTL_ defines, which are used for conversation between main hastd process and workers, remove unused one and set different range of numbers. This is done in order not to confuse them with HASTCTL_CMD defines, used for conversation between hastctl and hastd, and to avoid bugs like the one fixed in r221075.
Approved by: pjd (mentor) MFC after: 1 week
|
221075 |
26-Apr-2011 |
trociny |
For conversation between hastctl and hastd we should use HASTCTL_CMD defines.
Approved by: pjd (mentor) MFC after: 1 week
|
220899 |
20-Apr-2011 |
pjd |
Correct comment.
MFC after: 1 week
|
220898 |
20-Apr-2011 |
pjd |
When we become primary, we connect to the remote and expect it to be in secondary role. It is possible that the remote node is primary, but only because there was a role change and it didn't finish cleaning up (unmounting file systems, etc.). If we detect such situation, wait for the remote node to switch the role to secondary before accepting I/Os. If we don't wait for it in that case, we will most likely cause split-brain.
MFC after: 1 week
|
220890 |
20-Apr-2011 |
pjd |
If we act in different role than requested by the remote node, log it as a warning and not an error.
MFC after: 1 week
|
220889 |
20-Apr-2011 |
pjd |
Timeout must be positive.
MFC after: 1 week
|
220865 |
19-Apr-2011 |
pjd |
Scenario:
- We have two nodes connected and synchronized (local counters on both sides are 0).
- We take secondary down and recreate it.
- Primary connects to it and starts synchronization (but local counters are still 0).
- We switch the roles.
- Synchronization restarts but data is synchronized now from new primary (because local counters are 0) that doesn't have new data yet.
To fix this issue we bump local counter on primary when we discover that connected secondary was recreated and has no data yet.
Reported by: trociny Discussed with: trociny Tested by: trociny MFC after: 1 week
|
220744 |
17-Apr-2011 |
trociny |
Remove hast_proto_recv(). It was used only in one place, where hast_proto_recv_hdr() may be used. This also fixes the issue (introduced by r220523) with hastctl, which crashed on assert in hast_proto_recv_data().
Suggested and approved by: pjd (mentor)
|
220573 |
12-Apr-2011 |
pjd |
The replication mode that is currently supported is fullsync, not memsync. Correct this and print a warning if different replication mode is configured.
MFC after: 1 week
|
220523 |
10-Apr-2011 |
trociny |
In hast_proto_recv() remove unnecessary check. The size is checked later in hast_proto_recv_data().
Approved by: pjd (mentor) MFC after: 1 week
|
220522 |
10-Apr-2011 |
trociny |
In hast_proto_recv_data() check that the size of the data to be received does not exceed the buffer size.
Approved by: pjd (mentor) MFC after: 1 week
|
220521 |
10-Apr-2011 |
trociny |
Fix a typo in comments.
Approved by: pjd (mentor) MFC after: 3 days
|
220274 |
02-Apr-2011 |
pjd |
Increase default timeout from 5 seconds to 20 seconds. 5 seconds is definitely too short under heavy load and I was experiencing those timeouts in my recent tests.
MFC after: 1 week
|
220273 |
02-Apr-2011 |
pjd |
Handle ENOBUFS on send(2) by retrying for a while and logging the problem.
MFC after: 1 week
|
220272 |
02-Apr-2011 |
pjd |
When we are operating on blocking socket and get EAGAIN on send(2) or recv(2) this means that request timed out. Translate the meaningless EAGAIN to ETIMEDOUT to give administrator a hint that he might need to increase timeout in configuration file.
MFC after: 1 month
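A sketch of the translation described in r220272, assuming the timeout is set with SO_RCVTIMEO. sk_recv_timeout() is a hypothetical helper, not the proto code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Blocking receive with a timeout; an expired timeout is ETIMEDOUT. */
static ssize_t
sk_recv_timeout(int fd, void *buf, size_t size, long timeout_ms)
{
	struct timeval tv;
	ssize_t n;

	assert(buf != NULL && timeout_ms > 0);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;
	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == -1)
		return (-1);
	n = recv(fd, buf, size, 0);
	/* On a blocking socket EAGAIN can only mean the timeout expired. */
	if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		errno = ETIMEDOUT;
	return (n);
}
```

An error message built from errno then reads as a timeout, which points the administrator at the timeout setting instead of a generic "try again" condition.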
|
220271 |
02-Apr-2011 |
pjd |
Declare directions for sockets between primary and secondary. In HAST we use two sockets - one for only sending the data and one for only receiving the data.
MFC after: 1 month
|
220270 |
02-Apr-2011 |
pjd |
Allow sends or receives on a socket to be disabled using shutdown(2) by interpreting a NULL 'data' argument passed to proto_common_send() or proto_common_recv() as a request to do so.
MFC after: 1 month
|
220266 |
02-Apr-2011 |
pjd |
Handle the problem described in r220264 by using GEOM GATE queue of unlimited length. This should fix deadlocks reported by HAST users.
MFC after: 1 week
|
220007 |
25-Mar-2011 |
pjd |
Add mapsize to the header just before sending the packet. Before it could change later and we were sending invalid mapsize. Some time ago I added optimization where when nodes are connected for the first time and there were no writes to them yet, there is no initial full synchronization. This bug prevented it from working.
MFC after: 1 week
|
220006 |
25-Mar-2011 |
pjd |
Use timeout from configuration file not only when sending and receiving, but also when establishing connection.
MFC after: 1 week
|
220005 |
25-Mar-2011 |
pjd |
Use role2str() when setting process title.
MFC after: 1 week
|
219900 |
23-Mar-2011 |
pjd |
Don't create socketpair for connection forwarding between parent and secondary. Secondary doesn't need to connect anywhere.
MFC after: 1 week
|
219887 |
22-Mar-2011 |
pjd |
Add my copyright.
MFC after: 1 week
|
219882 |
22-Mar-2011 |
trociny |
After synchronization is complete we should make primary counters be equal to secondary counters:
primary_localcnt = secondary_remotecnt
primary_remotecnt = secondary_localcnt
Previously it was done wrong and split-brain was observed after primary had synchronized up-to-date data from secondary.
Approved by: pjd (mentor) MFC after: 1 week
|
219879 |
22-Mar-2011 |
trociny |
For requests that are sent only to remote component use the error from remote.
Approved by: pjd (mentor) MFC after: 1 week
|
219873 |
22-Mar-2011 |
pjd |
The proto API is a general purpose API, so don't use 'hast' in structures or function names. It can now be used outside of HAST.
MFC after: 1 week
|
219864 |
22-Mar-2011 |
pjd |
White space cleanups.
MFC after: 1 week
|
219847 |
21-Mar-2011 |
pjd |
When dropping privileges prefer capsicum over chroot+setgid+setuid. We can use capsicum for secondary worker processes and hastctl. When working as primary we drop privileges using chroot+setgid+setuid still as we need to send ioctl(2)s to ggate device, for which capsicum doesn't allow (yet).
X-MFC after: capsicum is merged to stable/8
|
219844 |
21-Mar-2011 |
pjd |
Initialize localcnt on first write. This fixes assertion when we create resource, set role to primary, do no writes, then set it to secondary and accept connection from primary.
MFC after: 1 week
|
219843 |
21-Mar-2011 |
pjd |
Fix typo.
MFC after: 1 week
|
219837 |
21-Mar-2011 |
pjd |
Before handling any events on descriptors check signals so we can update our info about worker processes if any of them was terminated in the meantime.
This fixes the problem with 'hastctl status' running from a hook called on split-brain:
1. Secondary calls a hook and terminates.
2. Hook asks for resource status via 'hastctl status'.
3. The main hastd handles the status request by sending it to the secondary worker who is already dead, but because signals weren't checked yet it doesn't know that and we get EPIPE.
MFC after: 1 week
|
219833 |
21-Mar-2011 |
pjd |
Remove stale comment. Yes, it is valid to set role back to init.
MFC after: 1 week
|
219832 |
21-Mar-2011 |
pjd |
Increase debug level of "Checking hooks." message.
MFC after: 1 week
|
219831 |
21-Mar-2011 |
pjd |
Be pedantic and free nvout before exiting.
MFC after: 1 week
|
219830 |
21-Mar-2011 |
pjd |
Detect situation where resource internal identifier differs. This means that both nodes have separately managed resources that don't have the same data.
MFC after: 1 week
|
219818 |
21-Mar-2011 |
pjd |
In hast.conf we define the other node's address in 'remote' variable. This way we know how to connect to secondary node when we are primary. The same variable is used by the secondary node - it only accepts connections from the address stored in 'remote' variable. In cluster configurations it is common that each node has its individual IP address and there is one additional shared IP address which is assigned to primary node. It seems it is possible that if the shared IP address is from the same network as the individual IP address it might be chosen by the kernel as a source address for connection with the secondary node. Such connection will be rejected by secondary, as it doesn't come from primary node individual IP.
Add 'source' variable that allows to specify source IP address we want to bind to before connecting to the secondary node.
MFC after: 1 week
|
219817 |
21-Mar-2011 |
pjd |
Log when we start hooks checking and when we execute a hook.
MFC after: 1 week
|
219816 |
21-Mar-2011 |
pjd |
Use snprlcat() instead of two strlcat(3)s.
MFC after: 1 week
|
219815 |
21-Mar-2011 |
pjd |
Add snprlcat() and vsnprlcat() - the functions I'm always missing. They work as a combination of snprintf(3) and strlcat(3) - the caller can append a string built based on the given format.
MFC after: 1 week
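The idea of combining snprintf(3) with strlcat(3) can be sketched as follows. This is not the committed implementation: the sketch_ names are hypothetical, and the return value (the length the full string would have had) is an assumption modelled on snprintf(3).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: append formatted text to the NUL-terminated string in
 * 'str' without writing past 'size' bytes.
 */
static int
sketch_vsnprlcat(char *str, size_t size, const char *fmt, va_list ap)
{
	const char *end;
	size_t len;

	assert(str != NULL && fmt != NULL);
	/* Find the current end of the string, looking at most 'size' bytes. */
	end = memchr(str, '\0', size);
	if (end == NULL)	/* No terminator within 'size': nothing to append. */
		return ((int)size);
	len = (size_t)(end - str);
	/* Format directly into the free space that follows the string. */
	return ((int)len + vsnprintf(str + len, size - len, fmt, ap));
}

static int
sketch_snprlcat(char *str, size_t size, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = sketch_vsnprlcat(str, size, fmt, ap);
	va_end(ap);
	return (ret);
}
```

With this shape a caller can build a message incrementally in one buffer instead of formatting into a temporary buffer and then calling strlcat(3).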
|
219814 |
21-Mar-2011 |
pjd |
When creating connection on behalf of primary worker, set pjdlog prefix to resource name and role, so that any logs related to that can be identified properly.
MFC after: 1 week
|
219813 |
21-Mar-2011 |
pjd |
If there is any traffic on one of our descriptors, we were not checking for long running hooks. Fix it by not using select(2) timeout to decide if we want to check hooks or not.
MFC after: 1 week
|
219721 |
17-Mar-2011 |
trociny |
For secondary, set 2 * HAST_KEEPALIVE seconds timeout for incoming connection so the worker will exit if it does not receive packets from the primary during this interval.
Reported by: Christian Vogt <Christian.Vogt@haw-hamburg.de> Tested by: Christian Vogt <Christian.Vogt@haw-hamburg.de> Approved by: pjd (mentor) MFC after: 1 week
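One way to set such a receive timeout on a socket is SO_RCVTIMEO, sketched below. Whether hastd uses exactly this call is an assumption, and the value of HAST_KEEPALIVE_SKETCH is a placeholder rather than the real HAST_KEEPALIVE constant.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Placeholder value in seconds; the real HAST_KEEPALIVE is not given here. */
#define	HAST_KEEPALIVE_SKETCH	10

/*
 * Make recv(2) on 'fd' fail once no data arrives for 'seconds' seconds,
 * so a worker blocked on the socket can notice a silent peer and exit.
 */
static int
set_recv_timeout_sketch(int fd, int seconds)
{
	struct timeval tv;

	assert(fd >= 0 && seconds > 0);
	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	return (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)));
}
```

The secondary would call it as set_recv_timeout_sketch(fd, 2 * HAST_KEEPALIVE_SKETCH) on the incoming connection.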
|
219669 |
15-Mar-2011 |
pjd |
Remove #include needed for debugging.
MFC after: 1 week
|
219482 |
11-Mar-2011 |
trociny |
Make workers inherit debug level from the main process.
Approved by: pjd (mentor) MFC after: 1 week
|
219385 |
07-Mar-2011 |
pjd |
Unbreak the build.
MFC after: 2 weeks
|
219372 |
07-Mar-2011 |
pjd |
- Log size of data to synchronize in human readable form (using %N).
- Log synchronization time (using %T).
- Log synchronization speed in human readable form (using %N).
MFC after: 2 weeks
|
219371 |
07-Mar-2011 |
pjd |
Use %S to print IP address and port number.
MFC after: 2 weeks
|
219370 |
07-Mar-2011 |
pjd |
- Turn on printf extensions.
- Load support for %T for printing time.
- Add support for %N for printing number in human readable form.
- Add support for %S for printing sockaddr structure (currently only AF_INET family is supported, as this is all we need in HAST).
- Disable gcc compile-time format checking as this will no longer work.
MFC after: 2 weeks
|
219369 |
07-Mar-2011 |
pjd |
Provide three states for pjdlog_initialized, so we can also tell that this is the first initialization ever.
MFC after: 2 weeks
|
219354 |
06-Mar-2011 |
pjd |
Allow to compress on-the-wire data using two algorithms:
- HOLE - it simply turns all-zero blocks into a header of a few bytes; it is extremely fast, so it is turned on by default; it is mostly intended to speed up initial synchronization where we expect many zeros;
- LZF - very fast algorithm by Marc Alexander Lehmann, which shows very decent compression ratio and has BSD license.
MFC after: 2 weeks
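The core of a HOLE-style scheme is deciding whether a block consists only of zeros. The sketch below shows just that check; the function name is hypothetical and the on-the-wire header layout is deliberately not reproduced, since this log does not describe it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true when every byte of the block is zero. A sender could then
 * transmit a short marker instead of the block itself.
 */
static bool
block_is_all_zero_sketch(const unsigned char *data, size_t size)
{
	size_t i;

	assert(data != NULL);
	for (i = 0; i < size; i++) {
		if (data[i] != 0)
			return (false);
	}
	return (true);
}
```

The check is a single linear pass over the block, which is why this kind of compression is cheap enough to enable by default.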
|
219351 |
06-Mar-2011 |
pjd |
Allow to checksum on-the-wire data using either CRC32 or SHA256.
MFC after: 2 weeks
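For illustration, a bitwise CRC32 over a buffer might look like the sketch below. It assumes the common reflected CRC-32 (polynomial 0xEDB88320); the commit message does not say which variant or implementation hastd uses, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise (table-free) CRC-32 sketch, assuming the reflected 0xEDB88320 form. */
static uint32_t
crc32_sketch(const void *buf, size_t size)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	assert(buf != NULL || size == 0);
	for (i = 0; i < size; i++) {
		crc ^= p[i];
		/* Process one input byte, least significant bit first. */
		for (bit = 0; bit < 8; bit++) {
			if (crc & 1u)
				crc = (crc >> 1) ^ 0xEDB88320u;
			else
				crc >>= 1;
		}
	}
	return (crc ^ 0xFFFFFFFFu);
}
```

The receiver would compute the same value over the received data and compare it with the checksum carried in the packet.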
|
218474 |
09-Feb-2011 |
pjd |
When we decide to unlink socket file, sun_path must be set. If it is set, but there is problem unlinking the file, log a warning.
MFC after: 1 week
|
218465 |
08-Feb-2011 |
pjd |
Explicitly include <sys/types.h> as suggested by getpid(2) and don't rely on <sys/un.h> including what's needed.
MFC after: 1 week
|
218464 |
08-Feb-2011 |
pjd |
Unlink UNIX domain socket file only if:
1. The descriptor is the one we are listening on (not the one when we connect as a client and not the one which is created on accept(2)).
2. Descriptor was created by us (PID matches with the PID stored on bind(2)).
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
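The two conditions above can be expressed as a single predicate. The struct and its field names below are hypothetical; they only model the state the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection context for a UNIX domain socket. */
struct uds_ctx_sketch {
	bool	listening;	/* Set only for the descriptor we listen on. */
	pid_t	owner;		/* PID remembered at bind(2) time. */
};

/*
 * Decide whether the socket file may be unlinked on close: only the
 * listening side may do it, and only in the process that bound it.
 */
static bool
uds_should_unlink_sketch(const struct uds_ctx_sketch *ctx)
{

	assert(ctx != NULL);
	return (ctx->listening && ctx->owner == getpid());
}
```

The PID comparison matters after fork(2): a child that inherited the context must not remove a socket file the parent is still serving.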
|
218376 |
06-Feb-2011 |
pjd |
Now that we break the loop on fstat(2) failure we no longer need to satisfy gcc's imperfections.
MFC after: 1 week
|
218375 |
06-Feb-2011 |
pjd |
Add (void) cast before snprintf(3)s for which we are not interested in return values.
MFC after: 1 week
|
218374 |
06-Feb-2011 |
pjd |
Treat fstat(2) failure (different than EBADF) as fatal error.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
|
218373 |
06-Feb-2011 |
pjd |
Open syslog when logging sysconf(3) failure.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
|
218370 |
06-Feb-2011 |
pjd |
Close more descriptors that can be open if the worker process for the given resource is already running.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
|
218218 |
03-Feb-2011 |
pjd |
Setup another socketpair between parent and child, so that primary sandboxed worker can ask the main privileged process to connect in worker's behalf and then we can migrate descriptor using this socketpair to worker. This is not really needed now, but will be needed once we start to use capsicum for sandboxing.
MFC after: 1 week
|
218217 |
03-Feb-2011 |
pjd |
Add missing locking after moving keepalive_send() to remote send thread in r214692.
MFC after: 1 week
|
218214 |
03-Feb-2011 |
pjd |
Let the caller log info about successful privilege drop. We don't want to log this in hastctl.
MFC after: 1 week
|
218194 |
02-Feb-2011 |
pjd |
- Rename proto_descriptor_{send,recv}() functions to proto_connection_{send,recv} and change them to return proto_conn structure. We don't operate directly on descriptors, but on proto_conns.
- Add wrap method to wrap descriptor with proto_conn.
- Remove methods to send and receive descriptors and implement this functionality as additional argument to send and receive methods.
MFC after: 1 week
|
218193 |
02-Feb-2011 |
pjd |
Add proto_connect_wait() to wait for connection to finish. If timeout argument to proto_connect() is -1, then the caller needs to use this new function to wait for connection.
This change is in preparation for capsicum, where sandboxed worker wants to ask main process to connect in worker's behalf and pass descriptor to the worker. Because we don't want the main process to wait for the connection, it will start async connection and pass descriptor to the worker who will be responsible for waiting for the connection to finish.
MFC after: 1 week
|
218192 |
02-Feb-2011 |
pjd |
Allow to specify connection timeout by the caller.
MFC after: 1 week
|
218191 |
02-Feb-2011 |
pjd |
Move protocol allocation and deallocation to separate functions.
MFC after: 1 week
|
218185 |
02-Feb-2011 |
pjd |
Be prepared that hp_client or hp_server might be NULL now.
MFC after: 1 week
|
218158 |
01-Feb-2011 |
pjd |
Do not set socket send and receive buffer. It will be auto-tuned.
Confirmed by: rwatson MFC after: 1 week
|
218148 |
31-Jan-2011 |
pjd |
Fix build on ia64.
I found no way to use the CMSG_NXTHDR() macro on ia64 without alignment warnings.
MFC after: 1 week
|
218147 |
31-Jan-2011 |
pjd |
Until I fix the build on ia64 comment out problematic lines. Those lines are part of the (for now) unused functions.
|
218139 |
31-Jan-2011 |
pjd |
Implement two new functions for sending descriptor and receiving descriptor over UNIX domain sockets and socket pairs. This is in preparation for capsicum.
MFC after: 1 week
|
218138 |
31-Jan-2011 |
pjd |
- Use pjdlog for assertions and aborts as this will log assert/abort message to syslog if we run in background.
- Assert in proto.c that the method we want to call is implemented and remove dummy methods from protocol implementations that are only there to abort the program with a nice message.
MFC after: 1 week
|
218132 |
31-Jan-2011 |
pjd |
Rename pjdlog_verify() to pjdlog_abort() as it better describes what the function does and mark it with __dead2.
MFC after: 1 week
|
218049 |
28-Jan-2011 |
pjd |
Drop privileges in worker processes.
Accepting connections and handshaking in secondary is still done before dropping privileges. It should be implemented by only accepting connections in privileged main process and passing connection descriptors to the worker, but is not implemented yet.
MFC after: 1 week
|
218048 |
28-Jan-2011 |
pjd |
Implement function that drops privileges by:
- chrooting to /var/empty (user hast home directory),
- setting groups to 'hast' (user hast primary group),
- setting real group id, effective group id and saved group id to 'hast',
- setting real user id, effective user id and saved user id to 'hast'.
At the end verify that those operations were successful.
MFC after: 1 week
|
218045 |
28-Jan-2011 |
pjd |
Use newly added descriptors_assert() function to ensure only expected descriptors are open.
MFC after: 1 week
|
218044 |
28-Jan-2011 |
pjd |
Add function to assert that the only descriptors we have open are the ones we expect to be open. Also assert that they point at expected type.
Because openlog(3) API is unable to tell us descriptor number it is using, we have to close syslog socket, remember assert message in local buffer and if we fail on assertion, reopen syslog socket and log the message.
MFC after: 1 week
|
218043 |
28-Jan-2011 |
pjd |
Close all unneeded descriptors after fork(2).
MFC after: 1 week
|
218042 |
28-Jan-2011 |
pjd |
Add comments to places where we treat errors as critical, but it is possible to handle them more gracefully.
MFC after: 1 week
|
218041 |
28-Jan-2011 |
pjd |
Add function to close all unneeded descriptors after fork(2).
MFC after: 1 week
|
218040 |
28-Jan-2011 |
pjd |
Initialize all global variables on pjdlog_init().
MFC after: 1 week
|
217969 |
27-Jan-2011 |
pjd |
Remember created control connection so on fork(2) we can close it in child.
Found with: procstat(1) MFC after: 1 week
|
217967 |
27-Jan-2011 |
pjd |
Close the control socket before exiting, so it will be unlinked.
MFC after: 1 week
|
217966 |
27-Jan-2011 |
pjd |
Extend pjdlog_verify() to support the following additional macros:
PJDLOG_RVERIFY() - always check expression and on false log the given message and exit.
PJDLOG_RASSERT() - check expression when NDEBUG is not defined and on false log given message and exit.
PJDLOG_ABORT() - log the given message and exit.
MFC after: 1 week
|
217965 |
27-Jan-2011 |
pjd |
Add functions to initialize/finalize pjdlog. This allows to open/close log file at will.
MFC after: 1 week
|
217964 |
27-Jan-2011 |
pjd |
Use my copyright for 2011 work.
MFC after: 1 week
|
217962 |
27-Jan-2011 |
pjd |
Add LOG_NDELAY flag to openlog(3) - we want descriptor to be immediately open so there are no surprises once we start chrooting or using capsicum.
MFC after: 1 week
|
217961 |
27-Jan-2011 |
pjd |
- Remove obvious NOTREACHED comment after abort() call.
- Remove redundant newline at the end of the file.
MFC after: 1 week
|
217958 |
27-Jan-2011 |
pjd |
Remove __dead2 from pjdlog_verify() prototype, it does return sometimes.
MFC after: 1 week
|
217784 |
24-Jan-2011 |
pjd |
Don't open configuration file from worker process. Handle SIGHUP in the master process only and pass changes to the worker processes over control socket. This removes access to global namespace in preparation for capsicum sandboxing.
MFC after: 2 weeks
|
217737 |
22-Jan-2011 |
pjd |
Add missing logs.
MFC after: 1 week
|
217732 |
22-Jan-2011 |
pjd |
Add nv_assert() which allows to assert that the given name exists.
MFC after: 1 week
|
217731 |
22-Jan-2011 |
pjd |
Use more consistent function name with the others (pjdlogv_prefix_set() instead of pjdlog_prefix_setv()).
MFC after: 1 week
|
217730 |
22-Jan-2011 |
pjd |
Use int16 for error.
MFC after: 1 week
|
217729 |
22-Jan-2011 |
pjd |
- On primary worker reload, update hr_exec field.
- Update comment.
MFC after: 1 week
|
217312 |
12-Jan-2011 |
pjd |
execve(2), not fork(2) resets signal handler to the default value (if it isn't ignored). Correct comment talking about that.
Pointed out by: kib MFC after: 3 days
|
217308 |
12-Jan-2011 |
pjd |
Add a note that when custom signal handler is installed for a signal, signal action is restored to default in child after fork(2). In this case there is no need to do anything with dummy SIGCHLD handler, because after fork(2) it will be automatically reverted to SIG_IGN.
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days
|
217307 |
12-Jan-2011 |
pjd |
Install default signal handlers before masking signals we want to handle. It is possible that the parent process ignores some of them and sigtimedwait() will never see them, even though they are masked.
The most common situation for this to happen is boot process where init(8) ignores SIGHUP before starting to execute /etc/rc. This in turn caused hastd(8) to ignore SIGHUP.
Reported by: trasz Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days
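The ordering described above (reset the disposition first, then mask, then wait) can be sketched like this. The function names are hypothetical and the sketch handles a single signal, whereas hastd handles several.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/*
 * Step 1: undo a SIG_IGN possibly inherited from the parent, then block
 * the signal so it stays pending until we explicitly wait for it.
 */
static int
prepare_signal_sketch(int signo)
{
	sigset_t mask;

	assert(signo > 0);
	if (signal(signo, SIG_DFL) == SIG_ERR)
		return (-1);
	sigemptyset(&mask);
	sigaddset(&mask, signo);
	return (sigprocmask(SIG_BLOCK, &mask, NULL));
}

/*
 * Step 2: wait up to 'seconds' for the blocked signal. Returns the
 * signal number, or -1 on timeout or error.
 */
static int
wait_signal_sketch(int signo, time_t seconds)
{
	struct timespec timeout;
	sigset_t mask;

	assert(signo > 0);
	sigemptyset(&mask);
	sigaddset(&mask, signo);
	timeout.tv_sec = seconds;
	timeout.tv_nsec = 0;
	return (sigtimedwait(&mask, NULL, &timeout));
}
```

If the disposition were left at SIG_IGN, the kernel would discard the signal instead of keeping it pending, which is the failure mode this commit addresses.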
|
216722 |
26-Dec-2010 |
pjd |
Detect when resource is configured more than once.
MFC after: 3 days
|
216721 |
26-Dec-2010 |
pjd |
When node-specific configuration is missing in resource section, provide more useful information. Instead of:
hastd: remote address not configured for resource foo
Print the following:
No resource foo configuration for this node (acceptable node names: freefall, freefall.freebsd.org, 44333332-4c44-4e31-4a30-313920202020).
MFC after: 3 days
|
216494 |
16-Dec-2010 |
pjd |
The 'ret' variable is of type ssize_t and we use proper format for it (%zd), so no (bogus) cast is needed.
MFC after: 3 days
|
216479 |
16-Dec-2010 |
pjd |
Improve problems logging.
MFC after: 3 days
|
216478 |
16-Dec-2010 |
pjd |
Don't ignore errors from remote requests.
MFC after: 3 days
|
216477 |
16-Dec-2010 |
pjd |
Log the fact of launching and include protocol version number.
MFC after: 3 days
|
215676 |
22-Nov-2010 |
brucec |
Don't generate input() since it's not used.
|
215332 |
15-Nov-2010 |
pjd |
Move timeout.tv_sec initialization outside the loop - sigtimedwait(2) won't modify it.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
215331 |
15-Nov-2010 |
pjd |
1. Exit when we cannot create incoming connection.
2. Improve logging to inform which connection can't be created.
Submitted by: [1] Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
214692 |
02-Nov-2010 |
pjd |
Send packets to remote node only via the send thread to avoid possible races - in this case a keepalive packet was sent from the wrong thread, which led to the connection dropping because of a corrupted packet.
Fix it by sending keepalive packets directly from the send thread. As a bonus we now send keepalive packets only when connection is idle.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
214284 |
24-Oct-2010 |
pjd |
Before this change on first connect between primary and secondary we initialize all the data. This is huge waste of time and resources if there were no writes yet, as there is no real data to synchronize.
Optimize this by sending "virgin" argument to secondary, which gives it a hint that synchronization is not needed.
In the common case (where both nodes are configured at the same time) instead of synchronizing everything, we don't synchronize at all.
MFC after: 1 week
|
214283 |
24-Oct-2010 |
pjd |
Implement nv_exists() function that returns true if argument of the given name exists.
MFC after: 3 days
|
214282 |
24-Oct-2010 |
pjd |
Move all NV defines into nv.c, they are not used externally thus there is no need to make them visible from outside.
MFC after: 3 days
|
214276 |
24-Oct-2010 |
pjd |
Simplify code a bit.
MFC after: 3 days
|
214275 |
24-Oct-2010 |
pjd |
Plug memory leak.
MFC after: 3 days
|
214274 |
24-Oct-2010 |
pjd |
Plug memory leaks.
Found with: valgrind MFC after: 3 days
|
214273 |
24-Oct-2010 |
pjd |
Load geom_gate.ko module after parsing arguments.
MFC after: 3 days
|
214119 |
20-Oct-2010 |
pjd |
Use closefrom(2) instead of close(2) in a loop.
MFC after: 1 week
|
213981 |
17-Oct-2010 |
pjd |
Log correct connection when canceling half-open connection.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213939 |
16-Oct-2010 |
pjd |
Use one fprintf() instead of two.
MFC after: 3 days
|
213938 |
16-Oct-2010 |
pjd |
Clear signal mask before executing a hook.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213580 |
08-Oct-2010 |
pjd |
We can't zero out ggio request, as we have some fields in there we initialize once during start-up.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213579 |
08-Oct-2010 |
pjd |
We close the event socketpair early in the mainloop to prevent spamming with error messages, so when we clean up after child process, we have to check if the event socketpair is still there.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213533 |
07-Oct-2010 |
pjd |
Clear ggate structures before using them. We don't initialize all the fields and there can be some garbage from the stack.
MFC after: 1 week
|
213531 |
07-Oct-2010 |
pjd |
Log error message when we fail to destroy ggate provider.
MFC after: 3 days
|
213530 |
07-Oct-2010 |
pjd |
Start the guard thread first, so we can handle signals from the very beginning.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
|
213529 |
07-Oct-2010 |
pjd |
Don't close local component on exit as we can hang waiting on g_waitidle. I'm unable to reproduce the race described in comment anymore and also the comment is incorrect - localfd represents local component from configuration file, eg. /dev/da0 and not HAST provider.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
|
213430 |
04-Oct-2010 |
pjd |
Decrease report interval to 5 seconds, as this also means we will check for signals every 5 seconds and not every 10 seconds as before.
MFC after: 3 days
|
213429 |
04-Oct-2010 |
pjd |
hook_check() is now only used to report about long-running hooks, so the argument is redundant, remove it.
MFC after: 3 days
|
213428 |
04-Oct-2010 |
pjd |
We can't mask ignored signal, so install dummy signal handler for SIGCHLD before masking it.
This fixes bogus reports about hooks running for too long and other problems related to garbage-collecting child processes.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213183 |
26-Sep-2010 |
pjd |
Plug memory leak on fork(2) failure.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213009 |
22-Sep-2010 |
pjd |
Switch to sigprocmask(2) API also in the main process and secondary process. This way the primary process inherits signal mask from the main process, which fixes a race where signal is delivered to the primary process before configuring signal mask.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213008 |
22-Sep-2010 |
pjd |
Assert that descriptor numbers are sane.
MFC after: 3 days
|
213007 |
22-Sep-2010 |
pjd |
Fix possible deadlock where worker process sends an event to the main process while the main process sends control message to the worker process, but worker process hasn't started control thread yet, because it waits for reply from the main process.
The fix is to start the control thread before sending any events.
Reported and fix suggested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
213006 |
22-Sep-2010 |
pjd |
Fix descriptor leaks: when child exits, we have to close control and event socket pairs. We did that only in one case out of three.
MFC after: 3 days
|
213004 |
22-Sep-2010 |
pjd |
If we are unable to receive control message it is most likely because the main process died. Instead of entering infinite loop, terminate.
MFC after: 3 days
|
213003 |
22-Sep-2010 |
pjd |
Sort includes.
MFC after: 3 days
|
212899 |
20-Sep-2010 |
pjd |
Add __dead2 to functions that we know are going to exit.
MFC after: 3 days
|
212052 |
31-Aug-2010 |
pjd |
Include process PID in log messages.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks
|
212051 |
31-Aug-2010 |
pjd |
Correct error message.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks
|
212049 |
31-Aug-2010 |
pjd |
Forgot to add event.c and event.h in r212038.
Pointed out by: pluknet <pluknet@gmail.com> MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212046 |
31-Aug-2010 |
pjd |
Mask only those signals that we want to handle.
Suggested by: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212038 |
30-Aug-2010 |
pjd |
Because it is very hard to make fork(2) from threaded process safe (we are limited to async-signal safe functions in the child process), move all hooks execution to the main (non-threaded) process.
Do it by maintaining connection (socketpair) between child and parent and sending events from the child to parent, so it can execute the hook.
This is a step in the right direction for other reasons too. For example there is one less problem to drop privs in worker processes.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212037 |
30-Aug-2010 |
pjd |
We only want to know if descriptors are ready for reading.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212036 |
30-Aug-2010 |
pjd |
When someone gives NULL as data, assume this is because the caller wants to declare connection side only.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212034 |
30-Aug-2010 |
pjd |
Use pjdlog_exit() before fork().
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
212033 |
30-Aug-2010 |
pjd |
Constify arguments we can constify.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211984 |
30-Aug-2010 |
pjd |
Execute hook when connection between the nodes is established or lost.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211983 |
30-Aug-2010 |
pjd |
Execute hook when split-brain is detected.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211982 |
30-Aug-2010 |
pjd |
Use sigtimedwait(2) for signals handling in primary process. This fixes various races and eliminates use of pthread* API in signal handler.
Pointed out by: kib With help from: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211981 |
29-Aug-2010 |
pjd |
- Move functionality responsible for checking one connection to separate function to make code more readable.
- Be sure not to reconnect too often in case of signal delivery, etc.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211979 |
29-Aug-2010 |
pjd |
Disconnect after logging errors.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211978 |
29-Aug-2010 |
pjd |
- Call hook on role change.
- Document new event.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211977 |
29-Aug-2010 |
pjd |
Allow to run hooks from the main hastd process.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211976 |
29-Aug-2010 |
pjd |
- Add hook_fini() which should be called after fork() from the main hastd process, once it starts to use hooks.
- Add hook_check_one() in case the caller expects different child processes and once it can recognize it, it will pass pid and status to hook_check_one().
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211975 |
29-Aug-2010 |
pjd |
Implement mtx_destroy() and rw_destroy().
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211899 |
27-Aug-2010 |
pjd |
When SIGTERM or SIGINT is received, terminate worker processes.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211898 |
27-Aug-2010 |
pjd |
When logging to stdout/stderr, flush after each log.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211897 |
27-Aug-2010 |
pjd |
Correct when we log interrupted synchronization.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211896 |
27-Aug-2010 |
pjd |
Check if no signals were delivered just before going to sleep.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211895 |
27-Aug-2010 |
pjd |
Add hooks execution.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211887 |
27-Aug-2010 |
pjd |
Document new 'exec' parameter.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211886 |
27-Aug-2010 |
pjd |
Allow to execute specified program on various HAST events.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211885 |
27-Aug-2010 |
pjd |
- Run hooks in background - don't block waiting for them to finish.
- Keep all hooks we're running in a global list, so we can report when they finish and also report when they are running for too long.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211884 |
27-Aug-2010 |
pjd |
When logging to stdout/stderr don't close those descriptors after fork().
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211883 |
27-Aug-2010 |
pjd |
Reduce indent where possible.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211882 |
27-Aug-2010 |
pjd |
Implement keepalive mechanism inside HAST protocol so we can detect secondary node failures quickly for HAST resources that are rarely modified.
Remove XXX from a comment now that the guard thread never sleeps infinitely.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211881 |
27-Aug-2010 |
pjd |
- Remove redundant and incorrect 'old' word from debug message.
- Log disconnects as warnings.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211880 |
27-Aug-2010 |
pjd |
Don't increase number of synchronized bytes in case of an error.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211879 |
27-Aug-2010 |
pjd |
Log that synchronization was interrupted in a proper place.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211878 |
27-Aug-2010 |
pjd |
We have sync_start() function to start synchronization, introduce sync_stop() function to stop it.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211877 |
27-Aug-2010 |
pjd |
Add QUEUE_INSERT() and QUEUE_TAKE() macros that simplify the code a bit.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211876 |
27-Aug-2010 |
pjd |
Add mtx_owned() implementation.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211875 |
27-Aug-2010 |
pjd |
Make comment more readable.
MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
|
211452 |
18-Aug-2010 |
pjd |
For some setups sending data in 128kB chunks makes communication very slow. No idea why. 32kB on the other hand seems to work properly everywhere.
Reported by: Thomas Steen Rasmussen <thomas@gibfest.dk> MFC after: 3 weeks
|
211407 |
16-Aug-2010 |
pjd |
The 'size' variable is there to limit how many bytes we want to copy from 'addr'. It is very likely that size of 'addr' is larger than 'size', so checking strlcpy() return value is bogus.
MFC after: 3 weeks
|
211397 |
16-Aug-2010 |
joel |
Fix typos, spelling, formatting and mdoc mistakes found by Nobuyuki while translating these manual pages. Minor corrections by me.
Submitted by: Nobuyuki Koganemaru <n-kogane@syd.odn.ne.jp>
|
210892 |
05-Aug-2010 |
pjd |
Document 'none' value for remote.
Reviewed by: dougb MFC after: 1 month
|
210886 |
05-Aug-2010 |
pjd |
Implement configuration reload on SIGHUP. This includes:
- Load added resources.
- Stop and forget removed resources.
- Update modified resources in least intrusive way, ie. don't touch /dev/hast/<name> unless path to local component or provider name were modified.
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 1 month
|
210883 |
05-Aug-2010 |
pjd |
Prepare configuration parsing code to be called multiple times:
- Don't exit on errors if not requested.
- Don't keep configuration in a global variable, but allocate memory for the configuration.
- Call yyrestart() before yyparse() so that on an error in the configuration file we will start from the beginning next time and not from the place we left off.
MFC after: 1 month
|
210882 |
05-Aug-2010 |
pjd |
Make control_set_role() more public. We will need it soon.
MFC after: 1 month
|
210881 |
05-Aug-2010 |
pjd |
Allow using the 'none' keyword as the remote address in case the second cluster node is not set up yet.
MFC after: 1 month
|
210880 |
05-Aug-2010 |
pjd |
Reset signal handlers after fork().
MFC after: 1 month
|
210879 |
05-Aug-2010 |
pjd |
- Use pjdlog_exitx() to log errors and exit instead of errx().
- Use 'unable to' (instead of 'cannot') consistently.
MFC after: 1 month
|
210876 |
05-Aug-2010 |
pjd |
Assert that various buffers are large enough.
MFC after: 1 month
|
210875 |
05-Aug-2010 |
pjd |
The problem with assertions is that they log to stderr. Add two macros, PJDLOG_ASSERT() and PJDLOG_VERIFY(), that check the given condition and log the problem where appropriate. The difference between the two is that PJDLOG_VERIFY() always works, while PJDLOG_ASSERT() can be turned off by defining NDEBUG.
MFC after: 1 month
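The difference between an always-on check and one that NDEBUG can disable can be sketched as below. This is only an illustration: SKETCH_VERIFY(), SKETCH_ASSERT(), sketch_report() and the failure counter are hypothetical names, not the pjdlog API, and the sketch counts failures instead of aborting so that it can be exercised safely.

```c
#include <stdio.h>

static int sketch_failures;	/* hypothetical counter used instead of abort() */

/* Hypothetical reporting hook; a real daemon would send this to its log. */
static void
sketch_report(const char *expr, const char *file, int line)
{
	sketch_failures++;
	fprintf(stderr, "check failed: (%s), file %s, line %d\n",
	    expr, file, line);
}

/* Always evaluated, regardless of NDEBUG. */
#define SKETCH_VERIFY(expr) do {					\
	if (!(expr))							\
		sketch_report(#expr, __FILE__, __LINE__);		\
} while (0)

/* Compiled out when NDEBUG is defined, like assert(3). */
#ifdef NDEBUG
#define SKETCH_ASSERT(expr) do { } while (0)
#else
#define SKETCH_ASSERT(expr) SKETCH_VERIFY(expr)
#endif

/* Runs one failing always-on check and returns how many were reported. */
int
sketch_verify_failures(void)
{
	sketch_failures = 0;
	SKETCH_VERIFY(1 + 1 == 3);
	return (sketch_failures);
}
```

Calling sketch_verify_failures() reports one failure whether or not NDEBUG is defined; the same condition wrapped in SKETCH_ASSERT() would be reported only in builds without NDEBUG.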
|
210873 |
05-Aug-2010 |
pjd |
Keep $FreeBSD$ in __FBSDID() only for C files.
MFC after: 1 month
|
210872 |
05-Aug-2010 |
pjd |
Mark two more places that we won't reach.
MFC after: 1 month
|
210870 |
05-Aug-2010 |
pjd |
Now that TCP will be checked last we don't need any knowledge about other protocols.
MFC after: 1 month
|
210869 |
05-Aug-2010 |
pjd |
Add an argument to the proto_register() function which allows protocol to declare it is the default and be placed at the end of the queue so it is checked last.
MFC after: 1 month
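A minimal sketch of the ordering idea, assuming a tail queue from <sys/queue.h>: ordinary protocols go to the head, the default one goes to the tail and is therefore checked last. The names sketch_proto, sketch_register() and sketch_check_position(), as well as the protocol names used, are hypothetical and only illustrate the technique; they are not taken from the hastd sources.

```c
#include <sys/queue.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical protocol descriptor. */
struct sketch_proto {
	const char *name;
	TAILQ_ENTRY(sketch_proto) next;
};
TAILQ_HEAD(sketch_list, sketch_proto);

/* The default protocol is queued last; all others are queued in front. */
void
sketch_register(struct sketch_list *list, struct sketch_proto *p,
    bool isdefault)
{
	if (isdefault)
		TAILQ_INSERT_TAIL(list, p, next);
	else
		TAILQ_INSERT_HEAD(list, p, next);
}

/*
 * Registers three illustrative protocols (the default one first) and
 * returns the 0-based position at which 'name' would be checked, or -1.
 */
int
sketch_check_position(const char *name)
{
	struct sketch_list list;
	struct sketch_proto tcp = { .name = "tcp" };
	struct sketch_proto uds = { .name = "uds" };
	struct sketch_proto sp = { .name = "socketpair" };
	struct sketch_proto *p;
	int i = 0;

	TAILQ_INIT(&list);
	sketch_register(&list, &tcp, true);
	sketch_register(&list, &uds, false);
	sketch_register(&list, &sp, false);

	TAILQ_FOREACH(p, &list, next) {
		if (strcmp(p->name, name) == 0)
			return (i);
		i++;
	}
	return (-1);
}
```

Even though the default entry is registered first here, it ends up at the end of the list, so registration order no longer decides which protocol is tried last.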
|
210702 |
31-Jul-2010 |
joel |
Spelling fixes.
|
210368 |
22-Jul-2010 |
pjd |
Actually, only the fullsync mode is implemented, not memsync mode. Correct manual page.
MFC after: 3 days
|
209185 |
14-Jun-2010 |
pjd |
Correct various log messages.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
209184 |
14-Jun-2010 |
pjd |
Fix typos.
MFC after: 3 days
|
209183 |
14-Jun-2010 |
pjd |
Initialize gctl_seq for synchronization requests.
Reported by: hiroshi@soupacific.com Analysed by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: hiroshi@soupacific.com, Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
209182 |
14-Jun-2010 |
pjd |
Plug memory leak.
Found by: Coverity Prevent CID: 7057 MFC after: 3 days
|
209181 |
14-Jun-2010 |
pjd |
Plug memory leak.
Found by: Coverity Prevent CID: 7056 MFC after: 3 days
|
209180 |
14-Jun-2010 |
pjd |
Plug memory leak.
Found by: Coverity Prevent CID: 7051 MFC after: 3 days
|
209179 |
14-Jun-2010 |
pjd |
Plug memory leaks.
Found by: Coverity Prevent CID: 7052, 7053, 7054, 7055 MFC after: 3 days
|
209177 |
14-Jun-2010 |
pjd |
Remove macros that are not really needed. The idea was to have them in case we grow more descriptors, but I'll reconsider re-adding them once we get there.
Passing an (a = b) expression to FD_ISSET() is a bad idea, as FD_ISSET() evaluates its argument twice.
Found by: Coverity Prevent CID: 5243 MFC after: 3 days
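The double-evaluation hazard can be shown with a stand-in macro. SKETCH_ISSET() below is a hypothetical macro that, like the FD_ISSET() described above, mentions its descriptor argument twice in the expansion; sketch_next_fd() is a hypothetical helper with a visible side effect.

```c
/* Hypothetical stand-in for FD_ISSET(): 'fd' appears twice in the expansion. */
#define SKETCH_ISSET(fd, set)	(((set)[(fd) / 8] >> ((fd) % 8)) & 1u)

static int sketch_calls;

/* Hypothetical helper whose side effect makes repeated evaluation visible. */
static int
sketch_next_fd(void)
{
	sketch_calls++;
	return (3);
}

/* Passes an expression with a side effect straight into the macro. */
int
sketch_evaluations_inline(void)
{
	unsigned char set[4] = { 0x08, 0, 0, 0 };	/* bit 3 is set */

	sketch_calls = 0;
	(void)SKETCH_ISSET(sketch_next_fd(), set);
	return (sketch_calls);
}

/* Evaluates the expression once, then tests the stored value. */
int
sketch_evaluations_hoisted(void)
{
	unsigned char set[4] = { 0x08, 0, 0, 0 };
	int fd;

	sketch_calls = 0;
	fd = sketch_next_fd();
	(void)SKETCH_ISSET(fd, set);
	return (sketch_calls);
}
```

The inline form runs the side effect once per occurrence of the argument in the expansion, while assigning to a variable first runs it exactly once.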
|
209175 |
14-Jun-2010 |
pjd |
Eliminate dead code.
Found by: Coverity Prevent CID: 5158 MFC after: 3 days
|
208028 |
13-May-2010 |
uqs |
mdoc: move remaining sections into consistent order
This pertains mostly to FILES, HISTORY, EXIT STATUS and AUTHORS sections.
Found by: mdocml lint run Reviewed by: ru
|
207390 |
29-Apr-2010 |
pjd |
The default connection timeout is way too long. To make it shorter we have to make the socket non-blocking, call connect() and, if we get EINPROGRESS, wait using select(). Very complex, but I know no other way to define a connection timeout for a given socket.
Reported by: hiroshi@soupacific.com MFC after: 3 days
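The sequence described in this message (non-blocking socket, connect(), wait with select() on EINPROGRESS) can be sketched as follows. connect_with_timeout() is a hypothetical helper written for illustration, not the code from this commit; it assumes fd is below FD_SETSIZE and does not retry select() on EINTR.

```c
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Returns 0 on success, -1 with errno set (ETIMEDOUT on timeout). */
int
connect_with_timeout(int fd, const struct sockaddr *sa, socklen_t salen,
    int seconds)
{
	struct timeval tv;
	fd_set wfds;
	socklen_t len;
	int flags, error, ret, saved;

	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return (-1);
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return (-1);

	ret = connect(fd, sa, salen);
	if (ret == -1 && errno == EINPROGRESS) {
		/* The socket becomes writable once the attempt finishes. */
		FD_ZERO(&wfds);
		FD_SET(fd, &wfds);
		tv.tv_sec = seconds;
		tv.tv_usec = 0;
		ret = select(fd + 1, NULL, &wfds, NULL, &tv);
		if (ret == 0) {
			errno = ETIMEDOUT;
			ret = -1;
		} else if (ret > 0) {
			/* Writable: ask the socket how connect() ended. */
			error = 0;
			len = sizeof(error);
			if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error,
			    &len) == -1) {
				ret = -1;
			} else if (error != 0) {
				errno = error;
				ret = -1;
			} else {
				ret = 0;
			}
		}
	}

	/* Restore the original flags without clobbering errno. */
	saved = errno;
	(void)fcntl(fd, F_SETFL, flags);
	errno = saved;
	return (ret);
}
```

A caller would pass a freshly created socket and the peer address; checking SO_ERROR after select() matters because a writable socket only means the attempt has finished, not that it succeeded.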
|
207372 |
29-Apr-2010 |
pjd |
- Check if the worker process was killed by signal and restart it.
- Improve logging.
Pointed out by: Garrett Cooper <yanefbsd@gmail.com> MFC after: 3 days
|
207371 |
29-Apr-2010 |
pjd |
Fix a problem where hastd will get stuck in recv(2) after sending a request to the secondary, which died between send(2) and recv(2). Do it by adding a timeout to recv(2) for the primary incoming and outgoing sockets and the secondary outgoing socket.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
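One common way to bound recv(2) is the SO_RCVTIMEO socket option. The message does not say which mechanism was used, so the sketch below is an assumption for illustration only; set_recv_timeout() and recv_times_out() are hypothetical helpers.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <errno.h>
#include <unistd.h>

/* Makes a blocking recv(2) on 'fd' fail after 'seconds' without data. */
int
set_recv_timeout(int fd, int seconds)
{
	struct timeval tv;

	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	return (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)));
}

/*
 * Demonstration: the peer never sends anything, so recv(2) returns -1
 * with EAGAIN/EWOULDBLOCK after the timeout instead of blocking forever.
 * Returns 1 if the timeout fired, 0 if not, -1 on setup failure.
 */
int
recv_times_out(void)
{
	char buf[1];
	ssize_t n;
	int sv[2], timed_out;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if (set_recv_timeout(sv[0], 1) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	n = recv(sv[0], buf, sizeof(buf), 0);
	timed_out = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
	close(sv[0]);
	close(sv[1]);
	return (timed_out);
}
```

With such a timeout in place, a peer that dies between send(2) and recv(2) turns into an error the caller can handle rather than an indefinite wait.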
|
207348 |
28-Apr-2010 |
pjd |
Restart worker thread only if the problem was temporary. In case of persistent problem we don't want to loop forever.
MFC after: 3 days
|
207347 |
28-Apr-2010 |
pjd |
Mark temporary issues as such.
MFC after: 3 days
|
207345 |
28-Apr-2010 |
pjd |
Use WEXITSTATUS() to obtain real exit code.
MFC after: 3 days
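The status filled in by waitpid(2) is an encoded value, not the exit code itself, which is why WEXITSTATUS() is needed. A minimal sketch, with child_exit_code() as a hypothetical helper:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that exits with 'code' and returns the exit code the
 * parent decodes, or -1 if the child did not exit normally.
 */
int
child_exit_code(int code)
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(code);		/* child: terminate immediately */
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	if (WIFEXITED(status))
		return (WEXITSTATUS(status));	/* the real exit code */
	return (-1);			/* e.g. killed by a signal */
}
```

WEXITSTATUS() is only meaningful when WIFEXITED() is true; a child terminated by a signal has to be detected separately with WIFSIGNALED().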
|
207343 |
28-Apr-2010 |
pjd |
Don't assume that "resource" property is in metadata.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
207070 |
22-Apr-2010 |
pjd |
Fix compilation with WITHOUT_CRYPT or WITHOUT_OPENSSL options.
Reported by: Andrei V. Lavreniyuk <andy.lavr@reactor-xg.kiev.ua> MFC after: 3 days
|
206697 |
16-Apr-2010 |
pjd |
Fix log size calculation which caused message truncation.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
206696 |
16-Apr-2010 |
pjd |
Fix control socket leak when worker process exits.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
|
206669 |
15-Apr-2010 |
pjd |
Increase ggate queue size to maximum value. HAST was not able to stand heavy random load.
Reported by: Hiroyuki Yamagami MFC after: 3 days
|
205738 |
27-Mar-2010 |
pjd |
Don't hold connection lock when doing reconnects as it makes I/Os wait for connection timeouts.
Reported by: Kevin Day <toasty@dragondata.com>
|
204596 |
02-Mar-2010 |
uqs |
Remove redundant WARNS?=6 overrides and inherit the WARNS setting from the toplevel directory.
This does not change any WARNS level and survives a make universe.
Approved by: ed (co-mentor)
|
204352 |
26-Feb-2010 |
ru |
Fixed static linkage.
|
204177 |
21-Feb-2010 |
pjd |
Changing proto_socketpair.c compilation and linking order revealed a problem - we should simply ignore proto_server() if address doesn't start with socketpair://, and not abort.
|
204076 |
18-Feb-2010 |
pjd |
Please welcome HAST - Highly Available Storage.
HAST allows data to be stored transparently on two physically separated machines connected over a TCP/IP network. HAST works in a Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total.
HAST operates on the block level - it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working on the block level makes it transparent for file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD.
For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST.
Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV
|