C Buffer in C

blind0ne · Nov 22, 2021

Hi,
I'm trying to learn how to code in C and am working through some quizzes on a website right now.
Please help me implement a buffer. Here is a piece of the example:

C:

struct document get_document(char* text) {
   
    char end = '\0';
    int i = 0;
    int stop = 0;
    char *buffer = malloc(99);
    while(stop != 1){
        
        // printf("%c : ", text[i]);
        // printf("%d;\n", i);
        i++;
        if(text[i] == '\0'){
            stop = 1;
        }
    }
    
    exit(0);
.....

How to properly work with empty buffers and store something in them?

eternal_noob · Nov 22, 2021

It's pretty unclear to me what you want to achieve.
What is struct document and why do you want to read your text buffer (char* text) into another buffer?

If you want to read the contents of a file into a buffer, try this:

C:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *f = fopen("textfile.txt", "rb");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }

    // get file size
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);
    fseek(f, 0, SEEK_SET);  // same as rewind(f);
    if (fsize < 0) {
        fclose(f);
        return 1;
    }

    // read contents into buffer
    char *string = malloc((size_t)fsize + 1);
    if (string == NULL) {
        fclose(f);
        return 1;
    }
    size_t nread = fread(string, 1, (size_t)fsize, f);
    fclose(f);

    // add null terminator after the bytes actually read
    string[nread] = 0;

    // do something with string

    // important to avoid memory leaks
    free(string);

    return 0;
}

a6h · Nov 22, 2021

I think he's trying to implement some sort of tokenization. strtok(3) maybe?

eternal_noob · Nov 22, 2021

C:

#include <string.h>
#include <stdio.h>

int main () {
   char str[80] = "This is - www.tutorialspoint.com - website";
   const char s[2] = "-";
   char *token;
   
   /* get the first token */
   token = strtok(str, s);
   
   /* walk through other tokens */
   while( token != NULL ) {
      printf( " %s\n", token );
   
      token = strtok(NULL, s);
   }
   
   return(0);
}

See https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm

ayleid96 · Nov 23, 2021

I assume you want to append text to a variable/structure? I don't quite understand. I wrote this example to be interactive; I like nice presentations.

EDIT: I rarely use loops; I prefer nicely controlled labels.

C:

    char *buffer = malloc(1);
    char *user_input = malloc(sizeof(char) * 200);
START:
    fprintf(stdout, "Please input text('print' to print, 'quit' to quit): ");
    fscanf(stdin, "%[^\n]%*c", user_input);

    if(strcmp(user_input, "print") == 0){
        fprintf(stdout, "Buffer content: %s\n", buffer);
    }

    else if(strcmp(user_input, "quit") == 0){
        goto END;
    }

    else{
        buffer = realloc(buffer, sizeof(user_input) + sizeof(buffer));
        strcat(buffer, user_input);
    }

    goto START;
   
END:
    free(buffer);
    free(user_input);
    return 0;

mark_j · Nov 24, 2021

buffer=malloc(1) is unnecessary and a waste of a system call. Just assign it to null. When freeing, just test for null in case it hasn't been used.
fprintf(stdout....) is just printf(....). You're writing code in C but wasting bytes. This is very non-C.

You don't test for realloc(), malloc() failure. Naughty. Naughty. In realloc() you would assign the result to a temporary pointer, then test it for success. If good, then assign it to buffer. Otherwise a failure results in a lost/dangling buffer pointer.

fscanf() is ok for a toy program, but it's a dangerous function and hard to use correctly.
I think you mean strlen() rather than sizeof(), unless you always want to add the size of a pointer on your hardware each iteration.
There may be more, but that's a start.
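
The points above could be put together roughly like this. It is a minimal sketch, not the original poster's code: append_text is a hypothetical helper name, the buffer starts out as NULL, the realloc() result goes through a temporary pointer, and the sizes come from strlen() rather than sizeof().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Append src to the heap string *buf (*buf may be NULL, meaning "empty").
 * Returns 0 on success, -1 if realloc() fails; on failure *buf is left
 * untouched and is still valid.
 * append_text is a hypothetical helper name, not something from the thread.
 */
int append_text(char **buf, const char *src)
{
    size_t old_len = (*buf != NULL) ? strlen(*buf) : 0;
    size_t add_len = strlen(src);

    /* realloc(NULL, n) behaves like malloc(n), so no malloc(1) is needed */
    char *tmp = realloc(*buf, old_len + add_len + 1);
    if (tmp == NULL)
        return -1;

    /* copy src including its terminating '\0' */
    memcpy(tmp + old_len, src, add_len + 1);
    *buf = tmp;
    return 0;
}
```

A caller would start with char *buffer = NULL, call append_text(&buffer, user_input) for each line, check the return value, and free(buffer) at the end. For reading the input, fgets(user_input, 200, stdin) bounds the read to the buffer size, which the "%[^\n]" conversion without a width does not.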

eternal_noob · Nov 24, 2021

ayleid96 said:
I rarely use loops; I prefer nicely controlled labels.

While I believe there are indeed use cases for label jumps with goto, the code you posted is a perfect example of when NOT to use goto.

mark_j said:
When freeing, just test for null in case it hasn't been used.

Freeing a null pointer does no harm. Just free it.

mark_j · Nov 24, 2021

I disagree, free() is a ~~system~~ library call. If/else is an instruction. Waste is waste. However both can be optimised out by the compiler.
I know the standard says don't test so technically you're correct.
As to gotos, I think this is an ok case because it's compact code. Sure a loop/while would be preferable, but again, you probably use a variable to test; more waste.

Trim that fat.

eternal_noob · Nov 24, 2021

mark_j said:
Waste is waste.

Freeing a null pointer is a no-op.

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.

http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf

mark_j said:
I know the standard says don't test so technically you're correct.

This is all that matters.
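
To illustrate the quoted rule: because free() on a null pointer does nothing, cleanup code can call it unconditionally. The sketch below assumes a hypothetical struct record whose members may never have been allocated; neither the type nor record_destroy comes from the thread.

```c
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
struct record {
    char *name;   /* may be NULL if never set */
    char *notes;  /* may be NULL if never set */
};

/* Release a record; the members need no NULL checks because free(NULL) is a no-op. */
void record_destroy(struct record *r)
{
    if (r == NULL)      /* guards the member accesses below, not free() */
        return;
    free(r->name);
    free(r->notes);
    free(r);
}
```

The only test left is the one protecting r->name and r->notes from being read through a null pointer.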

shkhln · Nov 24, 2021

free isn't a syscall, munmap is.

a6h · Nov 24, 2021

Some notes on free() and malloc():

There are two groups of free() functions that belong to the kernel memory management interfaces, i.e. the routines documented under intro(9).

malloc(9) and free(9) are the general-purpose allocator for dynamic memory in the kernel. As opposed to the C library routines from intro(3), and even the syscalls from intro(2), these functions from intro(9) are very OS-specific; compare FreeBSD and OpenBSD. There is also the zfree() function, which first zeroes the memory and then releases it.

In FreeBSD there is another type of allocator, the zone allocator, which is responsible for allocating dynamic memory for, for example, the process and thread structures in the kernel. The uma_zalloc() and uma_zfree() functions belong to this type, and they are documented under uma(9).

kpedersen · Nov 25, 2021

My personal experience is that it is pretty rare to call free(). Usually you will be working with higher-level APIs such as gtk_widget_destroy, OCIHandleFree, sqlite3_close, etc.

It is complete guesswork whether these functions test for NULL, so frankly it is just easier, and reduces cognitive workload, if you just test without thinking.

One thing that I don't tend to bother with is to NULL out the dangling pointer after a free. It seems a little bit naive to me because there are likely to be other references to it somewhere that you can't easily NULL out.

ayleid96 · Nov 26, 2021

eternal_noob said:
While I believe there are indeed use cases for label jumps with goto, the code you posted is a perfect example of when NOT to use goto.

Freeing a null pointer does no harm. Just free it.

Tell me why, then. I am not that experienced a programmer, but everything I have done up to this point works completely fine.

EDIT: Why not use labels for simple things like this?

Alain De Vos · Nov 26, 2021

while (true) { ...}

eternal_noob · Nov 26, 2021

ayleid96 said:
Why not use labels for simple things like this?

It's all about semantics. I don't know if you know web development, but a simple analogy would be the use of a heading.
You don't use the <b><font-size> conglomerate but the dedicated element <h1>.

Same in C. If you want to do something while a condition is true, you use the do ... while construct. It's just more readable and clear.
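
As a rough sketch of that point, the START/END labels from the earlier example map onto a loop along these lines (a plain while is used here because the exit test comes first). run_commands is a hypothetical name, and the buffer handling is replaced by a simple counter to keep the focus on control flow.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read lines from `in` until "quit" or end of input and count the lines
 * that were neither "print" nor "quit".
 * run_commands and the counter are assumptions made for this sketch.
 */
int run_commands(FILE *in, FILE *out)
{
    char line[200];
    int stored = 0;

    /* the loop header states the repeat rule that `goto START` expressed */
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */

        if (strcmp(line, "quit") == 0)
            break;                          /* takes the place of goto END */

        if (strcmp(line, "print") == 0)
            fprintf(out, "Lines stored: %d\n", stored);
        else
            stored++;                       /* stand-in for the append step */
    }
    return stored;
}
```

Calling run_commands(stdin, stdout) gives the interactive behaviour; a reader sees at a glance where the loop starts, what keeps it going, and where it ends.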

ayleid96 said:
everything I have done up to this point works completely fine.

This is not a criterion. A mediocre programmer writes programs which computers can understand. A good programmer writes programs which humans can understand.

a6h · Nov 26, 2021

ayleid96 said:
Tell me why, then. I am not that experienced a programmer, but everything I have done up to this point works completely fine.

I think you should wait a while and gain experience in C; I bet the goto-or-not-goto question won't bother you anymore. It simply won't come up. There are hypothetical situations, though. For example, what if I have a 10-layer nested loop? The real question is how you got there in the first place. Maybe you should use a look-up table. For now, just take Dijkstra's good old advice and don't use it.

Then there are other questions that will come up. They are important, but not in every context, and not everyone has to deal with them. For example: should I use local variables, or uninitialized or initialized static variables? In relation to elf(5), the way the kernel maps different portions of an executable into the address space could affect the final result -- there are different sections in an executable file to hold different data. But again, in most cases it won't show up. Some people have to take those situations into consideration, but in most situations the compiler handles it perfectly fine. You will cross that bridge when you come to it.

Zvoni · Nov 26, 2021

eternal_noob said:
This is not a criterion. A mediocre programmer writes programs which computers can understand. A good programmer writes programs which humans can understand.

I saw a quote once:
Code is like humor: If you have to explain it, it's bad.

As for the GOTO-discussion: Avoid it like the plague.
With a GOTO you break out of ANY controlled code-flow.
The problem itself is not the Goto-Statement, it's returning back from where it's been called

eternal_noob · Nov 26, 2021

Zvoni said:
Code is like humor: If you have to explain it, it's bad.

Yes. Most programmers forget about self-documenting code.

ayleid96 · Nov 28, 2021

mark_j said:
buffer=malloc(1) is unnecessary and a waste of a system call. Just assign it to null. When freeing, just test for null in case it hasn't been used.
fprintf(stdout....) is just printf(....). You're writing code in C but wasting bytes. This is very non-C.
You don't test for realloc(), malloc() failure. Naughty. Naughty. In realloc() you would assign the result to a temporary pointer, then test it for success. If good, then assign it to buffer. Otherwise a failure results in a lost/dangling buffer pointer.

fscanf() is ok for a toy program, but it's a dangerous function and hard to use correctly.
I think you mean strlen() rather than sizeof(), unless you always want to add the size of a pointer on your hardware each iteration.
There may be more, but that's a start.

I thought realloc() required already allocated memory. Thanks for that. I don't want to change fprintf, because I can specify the stream; I want to standardize on something and use it as generally as possible. According to that logic even printf is a waste of memory. We should use syscall() { syscall(SYS_write, 1, "Print MSG", 9); }.

> You don't test for realloc(), malloc() failure. Naughty. Naughty.
It's a toy program. Yep..

You gave me some good points, which I appreciate greatly. Thanks. But I think you are exaggerating a little.
--------------

I could work on my code's self-documentation. That's true..
I still don't understand why I should not use labels for simple things. It seems to me like it's all about someone's opinion. I completely agree when you have more complex problems to solve in loops, but for this, labels are just fine.

eternal_noob · Nov 28, 2021

ayleid96 said:
I still don't understand why I should not use labels for simple things.

Then you should read my answer again. It perfectly makes sense.

ayleid96 · Nov 28, 2021

eternal_noob said:
Then you should read my answer again. It perfectly makes sense.

So semantics is your argument. I was always going with "A good computer programmer writes code that the computer can understand." By my logic, such code is more efficient/faster, while making code easier for programmers to understand burdens the result with extra bytes (OOP, for example).

a6h · Nov 28, 2021

ayleid96 said:
According to that logic even printf is a waste of memory. We should use syscall() { syscall(SYS_write, 1, "Print MSG", 9); }.

Syscalls are expensive. A syscall requires switching execution mode from user mode to kernel mode. Some library functions make syscalls and some don't, e.g. strlen(3), and those are therefore less expensive.

Library functions generally offer higher-level abstractions. For example, both fopen(3) and open(2) end up making a syscall, but fopen(3) gives you buffered I/O while open(2) does not.

Library functions are also more portable. printf(3) is printf(3) on both FreeBSD and OpenBSD, but that's not always the case for syscalls. For example, OpenBSD has adjfreq(2) to help ntpd by making slight adjustments to the clock frequency, but AFAIK there's no adjfreq(2) in FreeBSD.

If a task can be done without a context switch, it is faster. Library functions are *generally* faster. They also run in user space, and thus with fewer privileges.
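
The buffering difference can be sketched as follows, assuming a POSIX system (write(2) comes from <unistd.h>). The function names write_unbuffered and write_buffered are made up for this illustration, and no timings are claimed.

```c
#include <stdio.h>
#include <unistd.h>

/* One write(2) system call per byte: n user-to-kernel transitions. */
long write_unbuffered(int fd, const char *data, size_t n)
{
    long done = 0;
    for (size_t i = 0; i < n; i++) {
        if (write(fd, data + i, 1) != 1)
            return -1;
        done++;
    }
    return done;
}

/*
 * stdio collects the bytes in a user-space buffer first. On a fully
 * buffered stream, write(2) is only called when that buffer fills up
 * or is flushed, so n bytes usually cost far fewer than n syscalls.
 */
long write_buffered(FILE *fp, const char *data, size_t n)
{
    long done = 0;
    for (size_t i = 0; i < n; i++) {
        if (fputc(data[i], fp) == EOF)
            return -1;
        done++;
    }
    return (fflush(fp) == 0) ? done : -1;
}
```

Running both under a tracer such as truss(1) or ktrace(1) on a regular file should show the difference in the number of write calls.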

ayleid96 · Nov 28, 2021

vigole said:
Syscalls are expensive. A syscall requires switching execution mode from user mode to kernel mode. Some library functions make syscalls and some don't, e.g. strlen(3), and those are therefore less expensive.

Library functions generally offer higher-level abstractions. For example, both fopen(3) and open(2) end up making a syscall, but fopen(3) gives you buffered I/O while open(2) does not.

Library functions are also more portable. printf(3) is printf(3) on both FreeBSD and OpenBSD, but that's not always the case for syscalls. For example, OpenBSD has adjfreq(2) to help ntpd by making slight adjustments to the clock frequency, but AFAIK there's no adjfreq(2) in FreeBSD.

If a task can be done without a context switch, it is faster. Library functions are *generally* faster. They also run in user space, and thus with fewer privileges.

Thank you for clearing that up!

a6h · Nov 28, 2021

I forgot to mention that include/ctype.h is a good example of how to reduce function calls by using macros -- in C, of course (pay attention to those #ifndef directives in the header).
